DOCUMENT RESUME

ED 342 680                                        SE 052 864

AUTHOR         Medrich, Elliott A.; Griffith, Jeanne E.
TITLE          International Mathematics and Science Assessments:
               What Have We Learned? Research and Development
               Report.
INSTITUTION    MPR Associates, Berkeley, CA.
SPONS AGENCY   National Center for Education Statistics (ED),
               Washington, DC.
REPORT NO      NCES-92-011
PUB DATE       Jan 92
NOTE           152p.
PUB TYPE       Reports - Evaluative/Feasibility (142)
EDRS PRICE     MF01/PC07 Plus Postage.
DESCRIPTORS    *Evaluation; *International Cooperation; *Mathematics
               Achievement; Mathematics Education; Science
               Education; Surveys; World Affairs
IDENTIFIERS    *Science Achievement

ABSTRACT

This report addresses two related issues. First, it
summarizes the past international studies of mathematics and science,
describing each study and its primary results. In place of country by
country performance rankings, the report presents the average
performance for each country accompanied by an estimate of the
statistical error circumscribing the limits of meaningful
country-to-country comparisons. Second, the report draws together
critical and heretofore inaccessible documentation, information that
scientists require to evaluate the quality of the surveys. The issues
surrounding the collection and analysis of these data are also
addressed. Further, it offers suggestions about ways by which new
data collection standards could improve the quality of the surveys
and the utility of the reports in the future. The report concludes by
suggesting that there is a need for more deliberate consideration of
policy concerns in the design of international assessments. This, in
turn, may provide opportunities for policymakers and education
practitioners to apply what is learned about cross-national
differences in achievement to curriculum development and programming.
Chapters include: (1) "Student Achievement in an International
Context"; (2) "International Achievement Surveys of Mathematics and
Science: An Overview"; (3) "The International Achievement Studies:
Mathematics and Science Scores"; (4) "What We Know about the
Achievement Scores and Country Rankings: A Summary of Selected
Results and Hypotheses from the International Surveys"; and (5)
"Looking Ahead: Toward Future International Achievement Surveys."
Appendices include achieved sample size and response rates, mean
scores and means compared with United States, secondary retention
rates, age distributions and related characteristics of test takers,
and mean scores and confidence intervals for participating
educational systems. (KR)



NATIONAL CENTER FOR EDUCATION STATISTICS

Research and Development Report January 1992

International Mathematics
and Science Assessments:
What Have We Learned?

U.S. Department of Education
Office of Educational Research and Improvement NCES 92-011






Elliott A. Medrich
Senior Research Associate
MPR Associates, Inc.

Jeanne E. Griffith
Associate Commissioner for Data Development
National Center for Education Statistics






U.S. Department of Education
Lamar Alexander
Secretary

Office of Educational Research and Improvement
Diane Ravitch
Assistant Secretary

National Center for Education Statistics
Emerson J. Elliott
Acting Commissioner



National Center for Education Statistics 

"The purpose of the Center shall be to collect, and analyze, 
and disseminate statistics and other data related to 
education in the United States and in other 
nations."— Section 406(b) of the General Education 
Provisions Act, as amended (20 U.S.C. 1221e-1).



February 1992 



Contact: 

Jeanne E. Griffith 
(202) 219-1395 



Foreword 



As international economic pressures increase demands for a well-educated work 
force, Americans expect more from the Nation's schools. Over the past 25 years a series of
international studies has focused attention on how elementary and secondary students from
the United States perform in mathematics and science as compared with students from other
countries. Results from the international surveys have been a matter of intense interest and
debate. On the one hand, they have drawn attention to the apparently mediocre performance
of American students, as well as to curriculum and instructional practices that have raised
questions about our own. On the other hand, a variety of technical issues concerning the
nature of the surveys, the comparability of the populations tested, and the quality of the
data have led to some questions about the reality of the findings.

This report addresses two related issues. First, it summarizes the past international 
studies of mathematics and science, describing each study and its primary results. In place
of country-by-country performance rankings, the report presents the average performance
for each country accompanied by an estimate of the statistical error circumscribing the limits
of meaningful country-to-country comparisons. Second, the report draws together critical
and heretofore inaccessible documentation — information that scientists require to evaluate 
the quality of the surveys. For example, information on cross-national differences in 
response rates is presented in every case where these data were available. At the same
time, the authors point to other nonsampling errors that may affect the data reliability and 
validity as well, but about which we do not have sufficient information to quantify. 
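As a rough illustration of how response rates of this kind can be tabulated, the sketch below computes school-level, student-level, and combined response rates for two invented educational systems. Everything in it is an assumption made for illustration: the class and function names are hypothetical, the counts are not taken from any survey, and the 85 percent threshold is borrowed from the title of Table II.1 rather than from a stated rule. It also assumes that students are sampled only within participating schools, so the combined rate is the product of the two levels.

```python
from dataclasses import dataclass


@dataclass
class SampleOutcome:
    """Achieved sample for one educational system (all figures hypothetical)."""
    system: str
    schools_sampled: int
    schools_participating: int
    students_sampled: int   # students sampled within participating schools
    students_tested: int


def response_rates(s: SampleOutcome) -> tuple[float, float, float]:
    """Return (school rate, student rate, combined rate) as fractions."""
    school = s.schools_participating / s.schools_sampled
    student = s.students_tested / s.students_sampled
    # Assumption: the combined rate is the product of the two sampling levels.
    return school, student, school * student


def meets_threshold(s: SampleOutcome, threshold: float = 0.85) -> bool:
    """True when both sampling levels reach the threshold (assumed 85 percent)."""
    school, student, _ = response_rates(s)
    return school >= threshold and student >= threshold


# Invented example data, not drawn from any of the surveys.
outcomes = [
    SampleOutcome("System A", 200, 180, 5000, 4500),
    SampleOutcome("System B", 200, 120, 3000, 2850),
]

for o in outcomes:
    school, student, combined = response_rates(o)
    print(f"{o.system}: school {school:.0%}, student {student:.0%}, "
          f"combined {combined:.0%}, meets 85% at each level: {meets_threshold(o)}")
```

Note that a system can look acceptable at the student level and still fall short once school-level nonresponse is taken into account, which is why the two levels are reported separately.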

Despite these data-related concerns, the international surveys — which have been done 
at different times and in different ways — come to some similar conclusions. This pattern of 
consistency suggests that the overall results are powerful and cannot be discounted. 
Learning about teaching and learning processes in other countries can lead to enhanced
student performance in American schools. Only by addressing the data-related problems
that hamper international studies will the potential for this kind of research be fully realized.
We hope that the insights in this report will continue to improve the planning and execution
of future studies. 

NCES, jointly with the National Science Foundation, has been striving in recent
years to strengthen the quality and generalizability of international assessments. We believe
that considerable improvements will soon be evident in reports from recent assessments of
science and mathematics and also of reading literacy. Further improvements are being
incorporated into the design of a new study of mathematics and science achievement
scheduled in 1994 and 1998 that the United States will use in monitoring progress toward
achieving the fourth National Education goal, which states that "By the year 2000, U.S.
students will be first in the world in science and mathematics achievement."



Emerson J. Elliott 

Acting Commissioner of Education Statistics 






Acknowledgments 



A number of individuals with significant commitments to enhancing the quality and 
utility of international achievement studies helped organize and execute this report. 

Larry Suter (formerly at the National Center for Education Statistics and now at the
National Science Foundation) proposed and initiated the project. Before the work began, he
organized a one-day conference to define some of the issues that would be addressed in this
report. Attending the conference were Senta Raizen (The National Center for Improving
Science Education), Ramsay Selden (Council of Chief State School Officers), Constance
Sorrentino (U.S. Department of Labor, Bureau of Labor Statistics), Harold Stevenson
(University of Michigan), and several representatives from the National Center for
Education Statistics. The conference proved invaluable to the authors and considerably
focused the project agenda.

Several individuals at the National Center for Education Statistics offered 
considerable assistance and guidance as the work progressed. John Ralph and Lois Peak 
provided insightful comments on the linkages between international issues and questions of
policy in the United States, and Mary Frase and Sue Ahmed assured that statistical issues
were appropriately addressed. Dawn Nelson assumed responsibility for managing the 
project in the winter of 1991, and did everything possible to facilitate the work and to 
ensure a high-quality product. 

At MPR Associates, Gary Hoachlander provided the support essential to meeting the 
needs of the project, Phillip Kaufman helped resolve a variety of thorny statistical problems,
and Christina Chang ably assisted as a research intern. 






National Center for Education Statistics 
Research and Development Reports 



The Research and Development (R&D) series of reports has been initiated: 

1) To share studies and research that are developmental in nature. The results of such 
studies may be revised as the work continues and additional data become available. 

2) To share results of studies that are, to some extent, on the "cutting-edge" of 
methodological developments. Emerging analytical approaches and new computer
software development often permit new, and sometimes controversial, analysis to 
be done. By participating in "frontier research," we hope to contribute to the 
resolution of issues and improved analysis. 

3) To participate in discussion of emerging issues of interest to educational 
researchers, statisticians, and the Federal statistical community in general. Such 
reports may document workshops and symposiums sponsored by NCES that 
address methodological and analytical issues, may summarize or synthesize a body 
of quantitative research, or may share and discuss issues regarding NCES practice, 
procedures, and standards. 

The common theme in all three goals is that these reports present results or discussion 
that do not reach definitive conclusions at this point in time, either because the data are
tentative, the methodology is new and developing, or the topic is one on which there are
divergent views. Therefore the techniques and inferences made from the data are tentative
and are subject to revision. To facilitate the process of closure on the issues, we invite
comment, criticism, and alternatives to what we have done. Such responses should be
directed to: 

Roger A. Herriot 

Associate Commissioner for Statistical Standards 

and Methodology 
National Center for Education Statistics 
555 New Jersey Avenue, NW 
Washington, DC 20208-5654 






Executive Summary 



The changing world economic order, foreshadowing new demands on the labor force
and workplace, highlights the larger international context within which American education
must be viewed. In January 1990, President Bush and the Nation's Governors recognized
these evolving needs and established a specific goal for mathematics and science
education, two subject areas critical to successful competition among highly technological
societies: "By the year 2000, U.S. students will be first in the world in science and
mathematics achievement." To measure progress toward this objective, there is increasing
interest in the periodic international assessments of student performance in mathematics and
science.

Over the past quarter century, there have been five major international studies of 
science and mathematics achievement at the elementary, middle, and secondary school 
levels. The studies have been conducted under the auspices of two different 
nongovernmental research consortia. More than 30 countries have participated in at least 
one of the surveys. The United States has been involved in every one. A great variety of 
findings have resulted from this work, and these studies represent valuable contributions to 
the ways in which schooling inputs and outcomes are understood. The research has 
challenged participating countries to examine the structure, practices, and curricula of their 
educational systems, and as a consequence, to envision the possibility of rethinking
curriculum content and the ways in which students are taught.

This report provides a description of the international assessments and some of their 
findings, and addresses issues surrounding the collection and analysis of these data. 
Further, it offers suggestions about ways in which new data collection standards could 
improve the quality of the surveys and the utility of the reports in the future. 



Three Mathematics Surveys 

• The First International Mathematics Study, conducted in the 1960s, involved
13-year-old students from 10 countries and students in their last year of secondary
school from 10 countries.

• The Second International Mathematics Study, performed in the early 1980s, 
involved 13-year-old students from 18 countries and students in their last year of 
secondary school from 13 countries. 

• The First International Assessment of Educational Progress, carried out in 1988, 
involved 13-year-olds from six countries.



Three Science Surveys 

• The First International Science Study, conducted between 1966 and 1973,
involved 10-year-old students from 16 countries, 14-year-old students from 18
countries, and students in their last year of secondary school from 18 countries.

• The Second International Science Study, performed between 1983 and 1986, 
involved 10-year-old students from 15 countries, 14-year-old students from 17 
countries, and students in their last year of secondary school from 13 countries. 






• The First International Assessment of Educational Progress, carried out in 1988,
involved 13-year-old students from six countries. 

The evidence suggests, in general, that students from the United States have fared
quite poorly on these assessments, with their scores lagging behind those of students from
other developed countries. This finding is based largely on analyses of mean achievement
scores and related rankings of countries participating in each survey. Understanding that
large-scale surveys pose a variety of analytical constraints and profit when complemented
by more intensive case studies of particular findings, the international assessments do not
explain why students from some countries perform better than their American counterparts.
In fact, regular and systematic patterns of differences are absent. For example, while
students from some countries may do better on some or most of the achievement tests than
students from other countries, the findings are age-group and subject-matter specific.
Hence, they are very difficult to generalize since they are not the product of a single set of
related, overriding school or institutional factors. Even so, across the studies certain trends
appear to be clear.

• The more students are taught, the more they learn, and the better they perform on
the tests. There are significant differences in the content of instruction among 
countries at common levels of schooling. 

• Use of a differentiated curriculum based on tracking is negatively associated with 
student performance on the international assessments and also reduces 
opportunities for some students to be exposed to more advanced curriculum. 

• The school affects learning in some subject areas more than in others. 

• Countries committed to keeping students enrolled in secondary school score less 
well on the international surveys, but they spread more knowledge across a larger 
population. Japan is an exception. Even with high retention rates at the secondary 
level, Japanese students perform very well on the mathematics and science 
achievement surveys. 

• Generally the "best students" in the United States do less well on the international
surveys when compared with the "best students" from other countries. 

A number of technical considerations inhibit generalizing many other findings. The 
surveys have not achieved high degrees of statistical reliability across the age groups 
sampled and among all of the participating countries. Thus, from a statistical point of view, 
there is considerable uncertainty as to the magnitude of measured differences in 
achievement. Inconsistencies in sample design and sampling procedures, the nature of the 
samples and their outcomes, and other problems have undermined data quality. But despite 
their shortcomings, international achievement surveys now provide valued ways of 
documenting differences and investigating issues in student performance cross-nationally. 
The challenge in the future will be to make certain that these surveys meet quality technical
standards. 
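To make the point about uncertainty concrete, the following sketch shows one common way of asking whether two mean scores differ by more than sampling error would explain. It is a simplified illustration, not the procedure used in the surveys or in this report: the function names, the scores, the standard deviations, the sample sizes, and the design effect of 4 (a stand-in for the loss of precision from sampling whole schools or classrooms) are all hypothetical, and the normal approximation with a 1.96 multiplier is an assumption.

```python
import math


def standard_error(sd: float, n: int, design_effect: float = 1.0) -> float:
    """Standard error of a mean, inflated by an assumed design effect."""
    return sd * math.sqrt(design_effect / n)


def confidence_interval(mean: float, se: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95 percent confidence interval around a mean score."""
    return mean - z * se, mean + z * se


def differs_significantly(mean_a: float, se_a: float,
                          mean_b: float, se_b: float,
                          z: float = 1.96) -> bool:
    """True when the gap between two independent means exceeds z standard errors."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) > z * se_diff


# Invented figures for three hypothetical systems (mean, SD, sample size).
se_a = standard_error(sd=20.0, n=400, design_effect=4.0)
se_b = standard_error(sd=18.0, n=900, design_effect=4.0)
se_c = standard_error(sd=18.0, n=900, design_effect=4.0)

print("A interval:", confidence_interval(52.0, se_a))
print("A vs B differ:", differs_significantly(52.0, se_a, 55.0, se_b))
print("A vs C differ:", differs_significantly(52.0, se_a, 60.0, se_c))
```

Under these assumptions a three-point gap between systems A and B falls within sampling error, while an eight-point gap between A and C does not, which is the sense in which a simple rank ordering can overstate what the data support.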

From all indications, the various international testing authorities and consortia are 
moving expeditiously toward improving the quality of the surveys and upgrading their
statistical reliability before the next rounds of international mathematics and science studies.
Among the important tasks that lie ahead are strengthening the comparability of samples
from country to country and developing new ways of reporting international achievement 
scores that will meet a variety of requirements and interests. It is noted that a considerable 
need also exists for small-scale case studies. These studies achieve in depth what they lack
in breadth and help researchers understand the circumstances contributing to differences in
performance among systems of education.

The report concludes by suggesting that there is a need for more deliberate 
consideration of policy concerns in the design of international assessments. This, in turn,
may provide opportunities for policymakers and education practitioners to apply what is 
learned about cross-national differences in achievement to curriculum development and 
programming. 






Table of Contents

Foreword
Acknowledgments
National Center for Education Statistics Research and Development Reports
Executive Summary
List of Tables
List of Figures

I. Student Achievement in an International Context
   Objectives of This Report
   Organization of This Report

II. International Achievement Surveys of Mathematics and Science: An Overview
   The Comparative Education Research Tradition
   Evaluating Data Quality and Defining a Field Outcome Standard
   The Five International Studies
   The IEA Studies
      The First International Mathematics Study (FIMS)
      The First International Science Study (FISS)
      The Second International Mathematics Study (SIMS)
      The Second International Science Study (SISS)
   The IAEP Study
      The First International Assessment of Educational Progress (IAEP-I): Mathematics and Science
   General Perspective on Samples and Sample Quality
   Summary

III. The International Achievement Studies: Mathematics and Science Scores
   International Achievement Test Scores: Interpreting the Results
   Summary

IV. What We Know About the Achievement Scores and Country Rankings: A Summary of Selected Results and Hypotheses from the International Surveys
   Results Reported Across the IEA Studies
   Selected Subject and Age Group Results and Hypotheses from Individual IEA Studies: Linkages to Achievement
      Curriculum, Teaching, Instructional Methods, and Achievement
      Student and Family Background Characteristics and Achievement
      Organization of Schools and of Instruction and Achievement
   Summary

V. Looking Ahead: Toward Future International Achievement Surveys
   Areas of Improvement: Sample Comparability and Reporting International Achievement Scores
      Sample Comparability
      Reporting Achievement Scores
   A Place for Small-Scale Studies
   Evidence of Progress: IEA Reading Literacy Study, IAEP-II, and the Third International Mathematics and Science Study
   Utility of International Studies for the Policy Agenda
   Conclusion

Appendices
   A: Achieved Sample Size and Response Rates
   B: Mean Scores and Means Compared with United States
   C: Secondary School Retention Rates
   D: Age Distributions and Related Characteristics of Test Takers
   E: Mean Scores and Confidence Intervals for Participating Educational Systems



List of Tables

II.1 Number of participating systems known to achieve 85 percent response rate at each level of sampling
III.1 International achievement test scores summary
III.2 Number of other participating systems scoring significantly higher than the United States by age or grade and number of participating systems

Appendix Tables

Appendix A

A.1 Sample size and response rates, schools and students: First International Mathematics Study, 13-year-olds
A.2 Sample size and response rates, schools and students: First International Mathematics Study, Last year of secondary school (mathematics students)
A.3 Sample size and response rates, schools and students: First International Mathematics Study, Last year of secondary school (non-mathematics students)
A.4 Sample size and response rates, schools and students: Second International Mathematics Study, 13-year-olds
A.5 Sample size and response rates, schools and students: Second International Mathematics Study, Last year of secondary school
A.6 Sample size and response rates, schools and students: First International Science Study, 10-year-olds
A.7 Sample size and response rate, students and schools: First International Science Study, 14-year-olds
A.8 Sample size and response rates, schools and students: First International Science Study, Last year of secondary school
A.9 Sample size and response rates, schools and students: Second International Science Study, 10-year-olds
A.10 Sample size and response rates, schools and students: Second International Science Study, 14-year-olds
A.11 Sample size and response rates, schools and students: Second International Science Study, Last year of secondary school (biology students)
A.12 Sample size and response rates, schools and students: Second International Science Study, Last year of secondary school (chemistry students)
A.13 Sample size and response rates, schools and students: Second International Science Study, Last year of secondary school (physics students)
A.14 Sample size and response rates, schools and students: International Assessment of Educational Progress, 13-year-olds (mathematics proficiency)
A.15 Sample size and response rates, schools and students: International Assessment of Educational Progress, 13-year-olds (science proficiency)

Appendix B

B.1 Mean scores and means compared with United States: First International Mathematics Study, 13-year-olds (70 items)
B.2 Mean scores and means compared with United States: First International Mathematics Study, Last year of secondary school (mathematics students, 69 items)
B.3 Mean scores and means compared with United States: First International Mathematics Study, Last year of secondary school (non-mathematics students, 58 items)
B.4 Mean scores and means compared with United States: Second International Mathematics Study, 13-year-olds (46 core items, arithmetic)
B.5 Mean scores and means compared with United States: Second International Mathematics Study, 13-year-olds (30 core items, algebra)
B.6 Mean scores and means compared with United States: Second International Mathematics Study, 13-year-olds (39 core items, geometry)
B.7 Mean scores and means compared with United States: Second International Mathematics Study, 13-year-olds (24 core items, measurement)
B.8 Mean scores and means compared with United States: Second International Mathematics Study, 13-year-olds (18 core items, descriptive statistics)
B.9 Mean scores and means compared with United States: Second International Mathematics Study, Last year of secondary school (17 items, number systems)
B.10 Mean scores and means compared with United States: Second International Mathematics Study, Last year of secondary school (26 items, algebra)
B.11 Mean scores and means compared with United States: Second International Mathematics Study, Last year of secondary school (26 items, geometry)
B.12 Mean scores and means compared with United States: Second International Mathematics Study, Last year of secondary school (46 items, elementary functions and calculus)
B.13 Mean scores and means compared with United States: First International Science Study, 10-year-olds (40 core items)
B.14 Mean scores and means compared with United States: First International Science Study, 14-year-olds (80 core items)
B.15 Mean scores and means compared with United States: First International Science Study, Last year of secondary school (60 core items)
B.16 Mean scores and means compared with United States: Second International Science Study, 10-year-olds (24 core items)
B.17 Mean scores and means compared with United States: Second International Science Study, 14-year-olds (30 core items)
B.18 Mean scores and means compared with United States: Second International Science Study, Last year of secondary school (30 core items, biology)
B.19 Mean scores and means compared with United States: Second International Science Study, Last year of secondary school (30 core items, chemistry)
B.20 Mean scores and means compared with United States: Second International Science Study, Last year of secondary school (30 core items, physics)
B.21 Mean scores and means compared with United States: International Assessment of Educational Progress, 13-year-olds (63 items, mathematics proficiency)
B.22 Mean scores and means compared with United States: International Assessment of Educational Progress, 13-year-olds (60 items, science proficiency)

Appendix D

D.1 Mean age and standard deviation of the age distribution of 13-year-old sample: First International Mathematics Study
D.2 Mean age and standard deviation of the age distribution of last-year secondary sample (mathematics students): First International Mathematics Study
D.3 Mean age and standard deviation of the age distribution of last-year secondary sample (non-mathematics students): First International Mathematics Study
D.4 Percentage of 10-year-old sample in different grades: First International Science Study
D.5 Percentage of 14-year-old sample in different grades: First International Science Study
D.6 Mean age of 13-year-old sample: Second International Mathematics Study
D.7 Mean age of last-year secondary sample: Second International Mathematics Study
D.8 Mean age of 10-year-old sample and selected schooling characteristics: Second International Science Study
D.9 Mean age of 14-year-old sample and selected schooling characteristics: Second International Science Study
D.10 Mean age of last-year secondary sample and selected schooling characteristics: Second International Science Study
D.11 Mean age of last-year secondary sample (biology) and percentage of students in school taking biology: Second International Science Study
D.12 Mean age of last-year secondary sample (chemistry) and percentage of students in school taking chemistry: Second International Science Study
D.13 Mean age of last-year secondary sample (physics) and percentage of students in school taking physics: Second International Science Study



List of Figures

II.1 Participants in the international achievement studies of mathematics and science

Appendix Figures

Appendix C

C.1 Estimated percentage of age group enrolled full time in the last year of secondary school: First International Mathematics Study, 1963-64
C.2 Estimated percentage of age group enrolled in last year of secondary school: First International Science Study, 1969
C.3 Estimated percentage of age group enrolled in last year of secondary school: Second International Mathematics Study, 1980-82
C.4 Estimated percentage of age group in last year of secondary school: Second International Science Study, 1983-86
C.5 Estimated percentage of 17-year-olds enrolled in school full time or part time at the secondary level: 1987-88

Appendix E 

E.l Mean scores and confidence intervals for panicipating educational systems: 

First International Mathematics Study, 13-year-olds (70 items) ..115 

E.2 Mean scores and confidence intervals for participating educational systems: 
First International Mathematics Study, Last year of secondary school 
(mathematics students — 69 items) 1 1 6 

E.3 Mean scores and confidence intervals for participating educational systems: 
First International Mathematics Study, Last year of secondary school 
(non-mathematics student — 58 items) 117 

E.4 Mean scores and confutence intervals for participating educational systems: 
Second International Mathematics Study, 13-year-olds (eighth grade) 
(46 core items — arithmetic) 118 

E.5 Mean scores and confidence intervals for participating educational systems: 
Second International Mathematics Study, 13-year-olds (eightii grade) 
(30 core items — algebra) 1 19 

E.6 Mean scores and confidence intervals for participating educational systems: 
Second Intemationai Mathematics Study, 13-year-oIds (eighth grade) 
(39 items — geometiy) 120 

E.7 Mean scores and confidence intervals for participating educational systems: 
Second International Mathematics Study, 13-year-olds (eighth grade) 
(24 core items — measurement) 121 



Q xvii 17 

ERIC 



E.8 Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, 13-year-olds (eighth grade)
(18 core items — descriptive statistics) 122

E.9 Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, Last year of secondary school
(17 items — number systems) 123

E.10 Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, Last year of secondary school
(26 items — algebra) 124

E.11 Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, Last year of secondary school
(26 items — geometry) 125

E.12 Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, Last year of secondary school
(46 items — elementary functions and calculus) 126

E.13 Mean scores and confidence intervals for participating educational systems:
First International Science Study, 10-year-olds (40 core items) 127

E.14 Mean scores and confidence intervals for participating educational systems:
First International Science Study, 14-year-olds (80 core items) 128

E.15 Mean scores and confidence intervals for participating educational systems:
First International Science Study, Last year of secondary school
(60 core items) 129

E.16 Mean scores and confidence intervals for participating educational systems:
Second International Science Study, 10-year-olds (24 core items) 130

E.17 Mean scores and confidence intervals for participating educational systems:
Second International Science Study, 14-year-olds (30 core items) 131

E.18 Mean scores and confidence intervals for participating educational systems:
Second International Science Study, Last year of secondary school
(30 core items — biology) 132

E.19 Mean scores and confidence intervals for participating educational systems:
Second International Science Study, Last year of secondary school
(30 core items — chemistry) 133

E.20 Mean scores and confidence intervals for participating educational systems:
Second International Science Study, Last year of secondary school
(30 core items — physics) 134

E.21 Mean scores and confidence intervals for participating educational systems:
International Assessment of Educational Progress, 13-year-olds
(63 items — mathematics proficiency) 135

E.22 Mean scores and confidence intervals for participating educational systems:
International Assessment of Educational Progress, 13-year-olds
(60 items — science proficiency) 136






Chapter I 

Student Achievement in an International Context 



As we enter the last decade of the 20th century, extraordinary changes in the shape of
the world foreshadow equally important changes in the marketplace and in the workplace.
The demands on the rising generation will be formidable.

For educators and education policymakers the implications of these changes have
been clear for some time. As early as 1983, the National Commission on Excellence in
Education cast special urgency on the matter of schooling and international competition in
their landmark report, A Nation at Risk.

Our once unchallenged preeminence in commerce, industry, science,
and technological innovation is being overtaken by competitors
throughout the world.... What was unimaginable a generation ago
has begun to occur — others are matching and surpassing our 
educational attainments.¹

In January 1990, 7 years after the National Commission's report, President Bush and the 
Nation's Governors highlighted the larger international context within which American 
education must be viewed:²

Our people must be as knowledgeable, as well trained, as 
competent, and as inventive as those in any other nation. All our
people, not just a few, must be able to think for a living, adapt to 
changing environments, and to understand the world around them. 
They must understand and accept the responsibilities and obligations 
of citizenship. They must continually learn and develop new skills 
throughout their lives. 

Addressing the intense technology-based environment within which the United States must 
compete, the President and the Governors defined a specific objective in the areas of 
mathematics and science education. They proposed that by the year 2000, U.S. students 
should rank first in the world in science and mathematics achievement. 

Policymakers, business leaders, educators, and citizens all note a perceived link 
between the future for a strong America and a well-educated labor force, capable of 
adjusting to the demands of a society in which technology and information hold the key to 
competitiveness. The Nation does not want its education system to be driven by labor
markets, for education defines the essence of our democracy and plays a much broader
role in developing a responsible citizenry, but voices from all sectors point to the need for
linkage. A study by the Organization for Economic Cooperation and Development
(OECD) emphasizes:



¹U.S. Department of Education, National Commission on Excellence in Education, A Nation at Risk
(Washington, DC, 1983), 5.

²U.S. Department of Education, National Goals for Education (Washington, DC, July 1990), 1.




. . .our societies are going through a period of rapid and far-reaching 
change. The signs of this are manifold... Technological progress, 
international trade, the speed of communications, world 
competition... these are just some aspects of the change which is 
posing crucial questions for our societies, structures and habits.... 
The analyses undertaken in the OECD, as elsewhere, in order to 
assess the effect of structural changes on economic performance all
point to the decisive and fundamental importance of education
systems. It is they that hold the key to possible progress and that
determine each country's medium and long-term prospects in world
competition.³

This is the challenge. And this is one reason why it is so important to understand how 
American youth compare with those of other countries on educational performance, and 
what factors in social, economic, and educational policies and programs are associated with
different levels of achievement.



Since the early 1960s, cross-national studies of student achievement have become one 
way of evaluating the product of the educational enterprise. While objectives governing the 
design of these studies have been many and varied, as often as not, public attention has 
focused single-mindedly on how students score on the performance tests and how 
countries rank against one another — as though the surveys represented a kind of 
international intellectual Olympics. 

In fact, international studies of student achievement are useful for many reasons other
than performance comparisons. The most important benefit to the United States of
participating in the international assessments is the understanding gained of a much
wider variety of education policies, programs, and practices that can help us improve our
own educational system. The National Research Council's Board on International 
Comparative Studies in Education (BICSE), which is sponsored by the National Center for 
Education Statistics, the National Science Foundation, and the Department of Defense, 
defines a broad set of objectives: 

...comparative research on education... increases the range of 
experience necessary to improve the measurement of educational 
achievement; it enhances confidence in the generalizability of studies
that explain the factors important in educational achievement; it 
increases the probability of dissemination of new ideas to improve 
the design or management of schools and classrooms; and it 
increases the research capacity of the United States as well as that of 
other countries. Finally, it provides an opportunity to chronicle 
practices and policies worthy of note in their own right.⁴

³Organization for Economic Cooperation and Development, Education and the Economy in a Changing
Society (Paris: OECD, 1989), 7.

⁴Norman M. Bradburn and Dorothy M. Gilford, eds., "A Framework and Principles for International
Comparative Studies in Education" (Washington, DC: National Academy Press, 1990), 4.

While some believe that the American values of equality, practicality, and
individualism combined with issues of local control of education may limit the possibility
of educational borrowing,⁵ there is clear evidence that all of this is changing. It is now
understood that international achievement studies can influence and help improve education
policy and programs in the United States and that these surveys represent important
opportunities to think about and examine many aspects of schooling in America by means
of comparison. On balance, too many of the most widely publicized summaries of the
surveys obscure rather than illuminate their meaning, and draw conclusions inappropriate
to their content and scope. This undermines many serious efforts to examine what these
studies really say about the skills and capabilities of American students, as compared with
those from other countries. Moreover, it diminishes efforts to describe what can be learned
about teaching methods, classroom processes, and curriculum in other countries that might
enhance schooling outcomes in the United States.



Objectives of This Report 

By providing a summary of the results of a select group of cross-national surveys,
this report turns its attention away from the newspaper headlines. A special effort will be
made to understand the meaning and import of the achievement test scores, recognizing
that this is just one aspect of the research. This synthesis has four objectives:

• To summarize and describe the international mathematics and science surveys and 
survey samples; 

• To understand what the test scores and associated findings do and do not say;

• To explore some important issues of study design and data presentation that may 
help researchers in preparing for similar studies in the future; and 

• To suggest some strategies for upgrading data quality in future studies. 

Comparative international achievement represents a new set of issues for the National 
Center for Education Statistics, and this report is written to meet several needs. First, 
NCES receives an increasing number of inquiries from Congress, the Executive Branch, 
and others who are interested in various issues addressed by the international achievement 
surveys and want to know more about what these data say. This report should be useful to 
those who require a general overview of these studies. Second, since NCES is now 
sponsoring international assessments,⁶ it is important to ascertain how the data measure up
to NCES standards for data collection efforts. NCES is now being asked by policymakers
to stand behind these studies. Can the data upon which the educational performance of
U.S. students is compared with the performance of students from other countries meet the
standards NCES applies to its own databases before release? This report describes a variety
of data-related problems that deserve attention so that the quality of future surveys can be 
strengthened and their increasing use in the policy arena can be supported by this agency. 

⁵B. Burn and C. Hurn, "An Analytic Comparison of Education Systems" (paper prepared for the U.S.
Department of Education, National Commission on Excellence in Education, 1983).

⁶As this is written, two sets of tests are in progress: one is near completion, the other is planned and
scheduled. The Educational Testing Service will publish results of the 1991 International Assessment of
Educational Progress (in the winter of 1992); and the International Association for the Evaluation of
Educational Achievement will undertake the Third International Mathematics and Science Study in two phases,
one in 1994, the other in 1998.

Despite data-related problems, the past international studies collectively have
generated important findings and hypotheses in education research. These, too, are
summarized in this report to demonstrate some of the strengths of comparative research.
The findings stand out beyond the flaws for one or more of the following reasons: 1) they
are consistent across many studies and test populations; 2) they are identified by analyzing
relationships within the data that are less subject to the technical problems identified; 3) they
are important in that they identify hypotheses that appear to be supported cross-nationally
but may need further evaluation; or 4) they corroborate education and social
theory tested in other national studies.

Thus, this study attempts to objectively present both the technical problems and the 
substantive strengths of these international assessments. 



Organization of This Report 

This report focuses on five studies of science and mathematics at elementary, middle,
and secondary school levels. These are curriculum areas that, in the more developed
countries at least, tend to involve instruction in somewhat similar subject matter covered at
about the same grade ranges. Constraining the synthesis in this way provides an opportunity
to look more closely at two areas of instruction that the Nation has associated with international
competitive issues and our Nation's capacity to move toward the emerging 21st century
economy.

More than 30 countries have participated in one or more of the studies discussed in 
this report. Four grade levels have been tested in at least one subject area (mathematics 
and/or science) at least once. The United States is unique in its commitment to international 
testing. No other country has been involved in as many studies at as many grade levels.

Chapter I establishes the context for this synthesis. Chapter II summarizes the large-scale
international mathematics and science surveys that have been conducted over the past
quarter century and explores general issues of data quality. Chapters III and IV look at the
achievement scores and some of the key findings of the studies. These chapters should be 
read along with the accompanying appendices that bring together, for the first time in a 
single source, much of the basic data needed to understand and summarize the surveys. 
Chapter V looks ahead, raising some of the data-related issues that could be addressed and 
that might improve future international surveys. With a new round of studies underway, 
this is an appropriate moment to review some of the results of past research, and to look at 
what these studies report and on what basis. 



Chapter II 



International Achievement Surveys of 
Mathematics and Science: An Overview 



International studies of student achievement are extraordinarily complex research
projects that are difficult to organize, administer, and analyze. To appreciate their strengths
and weaknesses, they must be understood against the backdrop of the research tradition 
that has defined their objectives and shaped their analytical focus. 



The Comparative Education Research Tradition

Until the late 1950s most comparative education research was aimed at describing the
mandate, structure, and support base for schooling within countries: types of schools,
level and sources of fiscal support, curriculum, teaching methods, enrollments, and so
forth. Little attention was paid to outcomes, other questions of performance, or student
achievement.

In 1959 this situation changed dramatically. That year a number of researchers, 
committed to understanding not only the nature of schooling across nations but also the 
quality of the educational product, founded the Council for the International Evaluation of 
Educational Achievement, subsequently known as the International Association for the 
Evaluation of Educational Achievement (IEA).

Since its inception, the IEA has significantly influenced the direction of comparative
education research by focusing its attention on the relationship between schooling inputs
and processes and student performance. T. Neville Postlethwaite, one of the IEA's
founders, described four objectives for comparative studies of this type:⁷

• Identifying what is happening in different countries that might help improve 
education systems and outcomes, such as philosophy of education, curriculum, 
resources, the organization of schools, teaching methods, and so on; 

• Describing similarities and differences among systems of education and 
interpreting them in terms of educational outcomes; 

• Estimating the relative effects of variables that are thought to be determinants of 
educational outcomes (both within and among systems of education); and

• Understanding why certain phenomena or practices appear to be important in some 
systems of education but not in others. 

⁷T. Neville Postlethwaite, "Preface," ed. T. Neville Postlethwaite, Encyclopedia of Comparative Education
and National Systems of Education (Oxford: Pergamon, 1988), xvii-xxvi.

Comparative studies now subsume a large literature that, as Postlethwaite writes,
"When done well...can deepen our understanding of our own education and society...can
be of assistance to policymakers and administrators and...can be a valuable component of
teacher education programs."⁸ But despite the variety of stated objectives, among all the
products of comparative education research, cross-national comparisons of student
achievement have attracted the most attention. Interest in such comparisons is ubiquitous,
and Americans, ever sensitive to issues of performance, are especially concerned with
"where we stand." Although there may be many reasons to resist simple comparisons of
student achievement, international studies rest uncomfortably between the world of the
researcher, committed to using comparative data to enrich the ways of understanding how
schools work, and the world of the policymaker and the educator, who must use student
outcome data to help decide how to allocate scarce resources among programs and to
defend the results of funded programs. International achievement comparisons represent an
uneasy bridge between these two worlds.

The strength of the international surveys of student achievement, as with other 
surveys, rests on the quality of the study and sample design and its implementation. If
these data are to represent real performance differences across countries, a necessary but
not sufficient condition is that the samples must meet reasonable standards of cross-national
comparability. From the perspective of policymakers and practitioners, the issue of 
sampling outcomes is far from academic, given the level of interest in the achievement 
scores and the potential bias that can be introduced by selective or nonrepresentative 
samples. 



Evaluating Data Quality and Defining a Field Outcome Standard 

This report attempts to evaluate selected technical aspects of the
international mathematics and science studies with a view toward understanding where
future improvements are indicated to support broader policy use of the results. International
achievement surveys are based on samples; hence, the data are susceptible to both sampling
and nonsampling errors. Sampling errors occur because estimates are based on samples of
students, not on entire student populations. Nonsampling errors may be caused by many
factors, among them an inability to obtain complete and correct information from and about
participants and nonparticipants; non-response; mistakes in recording or coding data; and
errors in collecting, processing, sampling, and estimating missing data. In international
studies, the special problem of differences in meaning introduced in the translation of test
instruments into different languages is an important non-sampling issue. Non-sampling
errors are difficult to estimate, but they may result in bias and unreliability of the data
themselves. Weights were used in each study to account for the sampling design and to
compensate for non-response; however, it was not possible to analyze weighting schemes
and their impact on data in this report.
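
To make the notion of sampling error concrete, the following is a minimal sketch and not a procedure taken from the studies themselves. It assumes simple random sampling, which the international surveys did not use; their clustered designs and weights would generally produce wider intervals than this calculation. The function names and the scores are hypothetical.

```python
import math
from statistics import mean, stdev


def mean_confidence_interval(scores, z=1.96):
    """Return (mean, lower, upper) for an approximate 95 percent interval.

    Assumes simple random sampling; z=1.96 is the normal approximation.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    m = mean(scores)
    standard_error = stdev(scores) / math.sqrt(n)
    return m, m - z * standard_error, m + z * standard_error


def intervals_overlap(a, b):
    """True when two (mean, lower, upper) intervals share any values."""
    return a[1] <= b[2] and b[1] <= a[2]


# Hypothetical percent-correct scores for two made-up educational systems.
system_a = [52, 61, 47, 58, 66, 49, 55, 60]
system_b = [50, 63, 45, 57, 59, 54, 48, 62]

ci_a = mean_confidence_interval(system_a)
ci_b = mean_confidence_interval(system_b)
print(ci_a)
print(ci_b)
print(intervals_overlap(ci_a, ci_b))
```

When two intervals overlap, as they do for these invented scores, the difference between the two means may reflect sampling error alone, which is one reason to read rank orderings cautiously. Overlap is only a rough screen and not a formal significance test.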

Response rates offer important information on the technical quality of each
international survey sample. The response rate is the ratio of those who actually participated
in a survey to those selected to participate in the sample. While there is no
formal statistical basis for defining adequacy of response rates, the National Center for
Education Statistics (NCES) has established its own standards, and these shall be adopted
for this discussion.⁹ The NCES standard establishes

minimum levels for performance in surveys and studies conducted
by the Center. The levels of data completeness and minimum levels
of data required for processing procedures and analysis are
established to ensure that researchers and users will have confidence
in the quality of the data.... The overall survey target response
rate...should be at least 85 percent for cross-sectional surveys. In
the case where the sample is selected hierarchically (e.g. schools,
and then teachers within schools) these rates apply to each
hierarchy....¹⁰

⁸Ibid., xix.

⁹See U.S. Department of Education, National Center for Education Statistics, Standards and Policies,
March 16, 1987, CES Standard 87-03-04.

The NCES standard represents an effort to define an adequate "field outcome" for purposes
of evaluating the quality of its own data programs and determining adequacy for release,
which, in turn, provides one way of describing data quality. While differences of opinion
exist regarding the definition of an acceptable response rate, the NCES standard is a
rigorous target. Although the international surveys were neither organized nor funded to
achieve such high response rates, high levels of non-response may have a significant
impact on the findings and how they can be interpreted. In fact, a lower response rate might
be acceptable if it could be shown that non-response bias was minimal or randomly
distributed. However, for future studies that NCES is heavily involved in funding,
adequate response levels will have to be attained for the agency to be able to stand behind
the results. Therefore the NCES standard represents one way of evaluating the likelihood
of non-response bias in the absence of any other test. To the extent that data fall short of the
NCES standard, they may be more likely to be biased because it is not known if the
non-response is proportionately distributed across the sample target population. Since
non-response was not analyzed in the technical reports supporting the surveys, the concern here
about response rates is reasonable and survey results must be viewed with caution.
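
The response-rate criterion discussed above can be illustrated with a short sketch. It is an illustration under stated assumptions, not NCES's own procedure: the function names, the three-way result (met, not met, not known), and all counts are hypothetical. It applies the 85 percent target to each level of a hierarchical sample, as the standard quoted above requires, and treats a system as "not known" to meet the standard when a level is unreported and no reported level falls short.

```python
NCES_TARGET = 0.85  # 85 percent target for cross-sectional surveys


def response_rate(participated, selected):
    """Ratio of those who participated to those selected for the sample."""
    if selected <= 0:
        raise ValueError("selected must be positive")
    return participated / selected


def meets_standard(levels, target=NCES_TARGET):
    """Check every sampling level against the target.

    levels maps a level name to (participated, selected), or to None when
    the counts were not reported. Returns True when all levels are reported
    and meet the target, False when any reported level falls short, and
    None when nothing falls short but at least one level is unreported.
    """
    known = [response_rate(*counts) for counts in levels.values() if counts is not None]
    if any(rate < target for rate in known):
        return False
    if len(known) < len(levels):
        return None
    return True


# Hypothetical counts for three made-up educational systems.
system_x = {"schools": (180, 200), "students": (4300, 5000)}  # 90% and 86%
system_y = {"schools": (120, 200), "students": (4700, 5000)}  # 60% and 94%
system_z = {"schools": None, "students": (4700, 5000)}        # school level unreported

print(meets_standard(system_x))
print(meets_standard(system_y))
print(meets_standard(system_z))
```

Keeping "not known" separate from "not met" mirrors the distinction this report draws between systems that reported data and fell short and systems for which no data were available.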

Assessing the adequacy of samples also requires examining the extent to which the 
samples meet study design requirements, understanding how countries defined sample 
eligibility, and describing how refusals to participate were handled. These questions 
underlie larger issues of survey design and administration and may be as important sources 
of non-sampling error as are response rates. The data needed to analyze these matters were 
often not available for the international studies, and therefore, in this report they are 
discussed with reference to some studies and not others. The general issue of study design 
requirements and the international achievement surveys will be considered in the 
concluding chapter. 



The Five International Studies 

This report focuses on five studies of mathematics and science achievement that were
conducted over a 25-year period. They represent a range of test types and organizing
procedures, and most important, they are arguably the most competently executed, large-scale
international surveys of their type. Figure II.1 describes the basic elements of each
study. As suggested by the figure, the ways in which participating entities defined
themselves do not make for simple country-to-country comparisons. In many countries,
sub-populations administered by autonomous educational authorities participated in the
surveys independent of one another. In other words, "whole" countries were not always
sampled. (For example, in some studies a number of Canadian provinces tested separately
in French and English, as was also the case in French and Flemish Belgium.) Hence, it is
worth emphasizing that in every study there are more educational systems participating than
countries participating. These distinctions, in turn, inhibit deriving and comparing national
estimates.

¹⁰Ibid., 15.

¹¹Appendix A presents the response rates for the five studies discussed in this report.

The remainder of this chapter describes the surveys, their target populations and
samples, the survey response rates, the content of each achievement test, and related data
collection issues. The material is drawn from published sources, which presents a special
problem. Many technical reports and strategic bulletins were produced in conjunction with
the various studies after the surveys were completed, but most were not made available to
the larger research community. As a result, while a great deal may be known about the
samples by individuals directly involved in this research, much of what is required for
evaluating the quality of the data is not available (e.g., reports describing sample execution
from country to country are not available or accessible years after the studies were
completed). Table II.1 summarizes the response rates based on the NCES 85 percent
standard mentioned above. Note particularly how few countries achieve the 85 percent
goal, and that the United States reaches this level only on one study.




Figure II.1: Participants in the international achievement studies in mathematics and science

[Figure not reproduced. The original is a grid marking which educational systems took part in each IEA study (First Mathematics, 1963-64; Second Mathematics; First Science, 1966-73; Second Science) and in the IAEP mathematics and science assessment, by population tested.]



Table II.1: Number of participating systems known to achieve 85 percent response rate at each level of sampling

Total = total participating systems; Known = systems known to achieve the 85% criterion.

Study                       Age 10          Age 13                       Age 14          Last-year secondary
                            Total  Known    Total  Known                 Total  Known    Total  Known
First Mathematics Study¹    --     --       12     0                     --     --       12     0 (Math students)
                                                                                         10     0 (Non-Math students)
Second Mathematics Study    --     --       20     4²                    --     --       15     6³
First Science Study         17     6⁴       --     --                    19     7⁵       19     4⁶
Second Science Study        15     8⁷       --     --                    17     10⁸      14     1 (Biology)⁹
                                                                                         14     1 (Chemistry)¹⁰
                                                                                         14     2 (Physics)¹¹
IAEP                        --     --       12     10 (Mathematics)¹²    --     --       --     --
                                            12     10 (Science)¹³

-- Not sampled.
SOURCE: See Appendix A.

¹[illegible]
²[illegible], Thailand. No data or insufficient data for 12 systems. Four provided data and did not meet criteria. Hence 4 of 20 known to achieve standard.
³[illegible]. No data or insufficient data available for 7 systems. Two provided data and did not meet criteria. Hence 6 of 15 known to achieve standard.
⁴[illegible], Sweden. No data available for 1 system. Ten provided data and did not meet criteria. Hence 6 of 17 known to achieve standard.
⁵[illegible], New Zealand, Sweden. No data available for 1 system. Eleven provided data and did not meet criteria. Hence 7 of 19 known to achieve standard.
⁶[illegible], France, Hungary, Sweden. No data available for 1 system. Fourteen provided data and did not meet criteria. Hence 4 of 19 known to achieve standard.
⁷[illegible]
⁸[illegible], Thailand. Seven provided data and did not meet criteria. Hence 10 of 17 known to achieve standard.
⁹Japan. No data available for 7 systems. Six provided data and did not meet criteria. Hence 1 of 14 known to achieve standard.
¹⁰Japan. No data available for 7 systems. Six provided data and did not meet criteria. Hence 1 of 14 known to achieve standard.
¹¹Japan, Poland. No data available for 7 systems. Five provided data and did not meet criteria. Hence 2 of 14 known to achieve standard.
¹²All except United Kingdom and Canada (New Brunswick: French). Hence 10 of 12 known to achieve standard.
¹³All except United Kingdom and Canada (New Brunswick: French). Hence 10 of 12 known to achieve standard.



The lEA Studies 



Four IEA studies dating back to the mid-1960s are reviewed here, representing the
historic core of international surveys of student achievement in mathematics and science.
Another IEA mathematics and science study (The Third International Mathematics and
Science Study) is to be fielded in 1994 and 1998. Other cross-national IEA research not
discussed here includes studies of reading literacy, reading comprehension, literature,
French, English, early childhood education, computer use, and civics and classroom
teaching practice.

The IEA holds a unique leadership role in the international testing community. IEA
was the first entity to develop and administer student achievement tests in more than one
country. These studies have attempted to explore almost every aspect of the elementary and
secondary school curriculum. The surveys have led to important improvements in large-scale
international sampling methodology, conceptual design, test administration, and data
analysis. Because the surveys were developed as research projects, typically without clear
financial support, they were consistently underfunded and even completing the achievement
testing process required extraordinary effort and commitment on the part of the IEA
researchers. The studies were originally designed to support comparative international
research, and while there was an interest in linkages to policy, the work did not explicitly
serve the diverse needs of policymakers. Since attention was drawn to the surveys,
however, in A Nation at Risk,¹² enormous policy attention has focused on them.

The IEA is an independent international cooperative, funded through a variety of
public and nonprofit sources with the participation of education research centers in nearly
50 developed and developing countries. It is organized as a consortium of Ministries of
Education, university education departments, and research institutes; projects are
undertaken by international coordinating centers around the world, and are coordinated by
IEA's small central staff. Most activities are undertaken on a highly decentralized basis with
modest institutional oversight. The agenda of the IEA is to study systems of education from
an international comparative perspective, focusing on five key issues:

1. The curriculum and its effects on education outcomes;

2. School and classroom organization and its effects on education outcomes;

3. The relationship between achievement and attitudes;

4. Educational attainment among special populations; and 

5. The relationship between changing demography and changing student 
achievement levels. 

In addition, the IEA provides technical assistance to developing countries attempting to
improve their educational research capabilities.

¹²Citing the work of Barbara Lerner, A Nation at Risk described how poorly American students had
performed on international achievement surveys.

¹³As described in T.N. Postlethwaite, "Introduction," Comparative Education Review 1 (1987), 7-9; and
T.N. Postlethwaite, "Comparative Educational Achievement Research: Can It Be Improved?" Comparative
Education Review 1 (1987), 150-58.

While IEA studies were not originally designed for or intended to be used specifically
for purposes of ranking student achievement cross-nationally, collecting data from many
educational systems with identical test instruments has inevitably intensified interest in
comparisons of the relative performance of one nation's students with the students of other
nations. Perhaps unintentionally, issues of country rank have come to dominate
discussions of the IEA survey results. Further, given the increasing interest in matters of
international economic competitiveness in the United States, attention to this aspect of the
IEA agenda continues to grow. At this point such comparisons are unavoidable, and IEA
researchers now recognize that comparisons of achievement and country rankings are
fundamental to their work. However, they continue to promote efforts to better understand
many other factors affecting student performance.



The First International Mathematics Study (FIMS)

Purpose. Conducted in the mid-1960s, the First International Mathematics Study was
the IEA's initial attempt to identify factors associated with differences in student
achievement. "The main objective of the study [was] to investigate the 'outcomes' of
various school systems by relating as many of the relevant input variables as possible...to
the output assessed by international test instruments."¹⁴

Mathematics was selected as a first area of study by the IEA because it was
recognized as central to every nation's curriculum. Further, "most of the countries involved
in the project were concerned with improving their scientific and technical education, at the
basis of which lies the learning of mathematics."15 Lastly, the IEA felt that mathematics
was a logical first subject area for study because it seemed "less difficult" to achieve
agreement on the nature of the curriculum appropriate to examine and to develop acceptable
test instruments in a cross-national setting.

Participants and survey content. Two age groups were surveyed:16 students at the
grade level at which the majority of pupils were age 13 (U.S. 8th grade) from 12 
educational systems; and students in the last year of secondary education (U.S. 12th grade) 
from 12 educational systems. At the secondary level, studies were conducted of students 
taking mathematics (from 11 systems) and students not taking mathematics (from 10 
systems). More than 133,000 students, 18,500 teachers and head teachers, and 5,450 
schools in 12 countries participated in the study. 

Thirteen-year-olds were tested in the following areas: basic arithmetic, advanced 
arithmetic, elementary algebra, intermediate algebra, Euclidean geometry, analytic 
geometry, sets, and affine geometry.

Two tests were derived for the last-year secondary population, one for those studying 
mathematics, and another for those not studying mathematics during the year of testing. 
Both groups were tested in the following areas: basic mathematics, advanced mathematics, 
elementary algebra, intermediate algebra, Euclidean geometry, analytic geometry, 
trigonometric and circular functions, analysis, probability, and logic. Those studying 
mathematics were also tested in calculus. 



14Torsten Husen, International Study of Achievement in Mathematics: A Comparison of Twelve
Countries, Vol. I (New York: Wiley, 1967), 30.

15T.N. Postlethwaite, "International Association for the Evaluation of Educational Achievement--The
Mathematics Study," Journal for Research in Mathematics Education 2 (1971): 70.
16See Figure II.1.

17For a complete description of each content area, see T.N. Postlethwaite, 105-7.






The FIMS instruments consisted of 10 versions of a 1-hour test. Each version included a
[...] 174 mostly multiple-choice items [...].

Supplemental questionnaires were developed to explore student views of teaching practices
and instruction in mathematics (22 items) and affective outcomes (43 items). [...]
administrators examined characteristics [...].

Sample design and field outcomes. Each participating entity established a center that
was responsible for devising a sampling procedure in accord with IEA guidelines and that
[...] international referee.19 Two- or three-stage stratified probability
samples were drawn in which schools were first stratified by type, and in some countries
[...] administrative area (e.g., U.S. school districts) [...].

The First International Mathematics Study represents the early legacy of the IEA
survey experiment. Published material reflects the monumental effort required to organize
and accomplish the research and to develop an analytical model. However, the details of the
sample procedures and execution results are sparse. Data on the sample design were largely
[...] sources, and response rates are unknown (see
Appendix tables A.1-A.3). In addition, descriptions of sample exclusions and the effects
of exclusions or refusals on the sample are unknown. Husen flags a serious problem:
"[...] possibly be discounted. In the
terminal mathematics group, there were only 222 pupils from France and 146 from Israel,
two of the four countries with the highest means."21

[...] calculated in unpublished work associated with
the study (especially individual country reports). However, the FIMS scores and rankings
must be read with caution because the field outcomes cannot be examined and the quality of
the data cannot be assessed.



The First International Science Study (FISS) 

Purpose. The First International Science Study was one part of a larger research
project (the "Six Subject Survey") conducted by the IEA from 1966
through 1973. (The six curriculum areas were science, literature, reading comprehension,
English as a foreign language, French as a foreign language, and civics.) The purpose of
the Science Study was to assess students' scientific knowledge and to measure their ability
to understand the nature and methods of science. The IEA had hoped to evaluate the effects of
science curriculum reform (that is, of innovative science programs) on achievement in
science (especially the impact of "active learning" related to school science laboratory
work). However, because it proved difficult to design instruments to evaluate laboratory



18Husen, Achievement in Mathematics, Vol. II, 47-50.
19Husen, Achievement in Mathematics, Vol. I, 40.
20Ibid., Chapter 9.

21Torsten Husen, International Study of Achievement in Mathematics: A Comparison of Twelve
Countries, Vol. II (New York: Wiley, 1967), 27.






skills, most of the analyses focused on understanding the impact of home background,
school, and attitudinal variables on achievement.22

Participants and survey content. Three populations were surveyed: students age
10 (U.S. 5th grade) from 17 educational systems; students age 14 (U.S. 9th grade) from
19 systems; and students in the last year of secondary education (U.S. 12th grade) from 19
systems. There were 137,000 students and 26,000 teachers from 6,900 schools
participating in the First International Science Study.

The following content areas were tested: earth science (only 10-year-olds were tested
in this subject); biology; physics; chemistry; nature and methods of science; and
understanding science (only 14-year-olds and last-year secondary students were tested in
the last two subject areas).

Students also completed attitudinal surveys. Younger students were asked about their
interest in science. Middle- and secondary-level students were asked more [...]
batteries concerning their interest in science, attitudes toward school science, attitudes
toward science in the world, description of science teaching from textbooks, and
description of [...] in the laboratory. Teacher and administrator surveys
explored curriculum coverage and teaching practice.

Across the sampled populations, the tests and surveys varied in design. 

• Tests for the 10-year-olds (two versions that were randomly assigned to test-
takers) ran for 30 minutes and consisted of 20 items each. Most of the items did
not involve questions specific to science instruction, and 11 items overlapped with
those administered to the 14-year-olds.

• Tests for the 14-year-olds (also in two versions that were randomly assigned to
test-takers) ran 60 minutes and consisted of 40 items. Eleven items overlapped
with those administered to the younger population, and 20 with those administered
to the older population.

• Tests for those in the last year of secondary school were subject-specific (biology,
chemistry, and physics), ran for 60 minutes, and consisted of 40 questions each.

• Attitudinal surveys included 22 items for the youngest population and 48 for the
two older groups.24

Sample design and field outcomes. As with the First International Mathematics
Study, each country established a national center responsible for sampling, and
a [...] referee approved each country's sampling plan. Depending on [...],
two- or three-stage stratified probability
samples were drawn. [...] not have funds with which to monitor sampling programs,



[...] Countries (New York: John Wiley, 1976), 286.

[...] of instruments, see Comber and Keeves, Science Education,
Chapter 2.






so it is not possible to determine whether all countries adhered to established procedures,
except insofar as particular nations reported deviations.25 

The First International Science Study offers a relatively complete description of field
outcomes. With regard to response rates (Appendix tables A.6-A.8), if the NCES response
rate guidelines were applied to the survey of 10-year-olds, 10 of 17 educational systems
reported response rates below the 85 percent response criterion (1 of these 10 did not
provide sampling information), including 2 among the 5 educational systems with the
highest mean scores.26 Among the 14-year-olds, 11 of 19 educational systems fell short of
the criterion (1 of these 10 did not provide sampling information), including 1 among the 5
participating systems with the highest mean scores.27 Among those in the last year of
secondary education, 14 of 19 systems reporting response rates fell below the NCES
guideline (1 of these 14 did not provide sampling information), including 3 among the 5
systems with the highest mean scores.28 In no case did the U.S. samples meet the
guideline.
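
The screening applied in the paragraph above can be expressed as a simple rule: a system passes only if it reported a response rate and that rate reaches the 85 percent criterion. The following Python fragment is a minimal sketch of that rule. The system names and rates are invented for illustration and are not taken from Appendix tables A.6-A.8; treating a missing rate as a failure is an assumption that mirrors how the text counts systems that supplied no sampling information.

```python
from typing import Optional

NCES_CRITERION = 0.85  # 85 percent response-rate guideline cited in the text


def meets_guideline(rate: Optional[float], criterion: float = NCES_CRITERION) -> bool:
    """Return True only when a rate was reported and reaches the criterion."""
    return rate is not None and rate >= criterion


def screen(systems: dict) -> list:
    """List the systems that fall short or did not report a rate."""
    return sorted(name for name, rate in systems.items() if not meets_guideline(rate))


# Hypothetical rates; None marks a system that supplied no sampling information.
example = {"System A": 0.91, "System B": 0.72, "System C": None, "System D": 0.85}
flagged = screen(example)
print(flagged)  # ['System B', 'System C']
```

A rate exactly at 85 percent is counted here as meeting the guideline; the report does not say how boundary cases were handled.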

Other aspects of the sample were problematic. Ten- and 14-year-olds were not 
sampled in the same way in every country. Some countries sampled by grade, finding it too 
difficult and too costly to sample by age. As a result, some significant differences existed in 
the construction of individual country samples in terms of the proportion of the target age 
group effectively excluded by grade sampling.29 

A more complicated problem arose in the sample of students in the last year of
secondary school. Participating systems agreed that only those enrolled in school when the
survey was administered would be tested and that no attempt would be made to test those
who, for whatever reason, were not attending school. This has precipitated an ongoing
debate over the import of student retention practices in relation to the high school samples
and survey achievement scores. These retention rates varied dramatically from country to
country at the time of each of the four IEA studies, especially the First International Science
and Mathematics Studies (see Appendix C). An important aspect of the data in this
appendix is the sharp increase in retention rates among many countries over the time span
of these international assessments.

Documentation on the First Science Study sample affords a clearer picture of the
sampling process and the difficulties encountered in trying to establish common sampling
practices across participating countries; in trying to define a target population in a way that
enables each country to successfully design and execute comparable samples; and, perhaps
most important, in trying to persuade schools to participate in this type of voluntary testing
program.



25Walker, Six Subject Survey, 26.

26Belgium (Flemish), United States.

27Federal Republic of Germany.

28Federal Republic of Germany, Netherlands, Scotland.

29Some countries excluded students who were 1 or more years behind in grade for their age (e.g., Chile,
Hungary, and Italy for 10-year-olds and Chile and Hungary for 14-year-olds); India only sampled the six
states in which Hindi is the official language; Israel excluded 14-year-olds not attending school and all
Arabic-speaking students; Belgium excluded students at the secondary level attending vocational schools;
and Thailand only sampled the area around Bangkok.






The Second International Mathematics Study (SIMS) 

Purpose. As compared with its predecessor, the Second International Mathematics 
Study (SIMS) was a more ambitious and complicated project, reflecting a significant 
amount of learning about the possibilities of large-scale, cross-national achievement 
surveys. Conducted during the 1981-82 school year, the purpose of the project was 

to compare and contrast, in an international context, the varieties of 
curricula, instructional practices and student outcomes, both 
attitudinal and cognitive. By portraying the mathematics program 
and outcomes of each participating system against a cross-national 
backdrop, each system is afforded an opportunity to understand
better the relative strengths and shortcomings of its own endeavors
in mathematics education.30

Participants and survey content. Two groups were surveyed:31 students at age 13
(U.S. 8th grade) from 20 educational systems; and students "who are in the normally
accepted terminal grade of the secondary education system and who are studying
mathematics as a substantial part [approximately 5 hours per week] of their academic
program" from 15 systems. The United States, along with a smaller subsample of 8
systems, also participated in a longitudinal study designed to assess growth in skills during
the course of the school year.32 To enable attribution of particular outcomes to teacher
practices and classroom processes, the longitudinal study pre-tested students early in the
school year, post-tested them at the end of the school year, and asked teachers to complete
comprehensive process questionnaires during the year.33

Thirteen-year-olds were tested in five content areas: arithmetic, algebra, geometry, 
measurement, and statistics. Content areas for the last year secondary tests were sets and 
relations, number systems, algebra, geometry, functions and calculus, and probability and 
statistics. 

All 13-year-olds were administered the same 40-item core test and also one of four 
other tests consisting of 34 (or 35) items. A total of 176 items were available. Students in
the last year of secondary education were administered two of eight tests of 17 items each 
from a set of 136 items. In both samples, items from the available pool were randomly 
assigned within content areas of each version, and test versions were randomly assigned to 
students. 
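
The item counts in the rotated design above can be checked with a few lines of arithmetic, and the random assignment of test versions to students can be sketched in the same place. The Python fragment below is an illustration only: it assumes 34 items per rotated form (the text gives 34 or 35), and the function name, student identifiers, and seed are hypothetical rather than drawn from the SIMS documentation.

```python
import random

CORE_ITEMS = 40          # core test taken by every 13-year-old
ROTATED_FORMS = 4        # each student also takes one of these
ITEMS_PER_ROTATED = 34   # the text gives 34 (or 35); 34 is assumed here

pool_13 = CORE_ITEMS + ROTATED_FORMS * ITEMS_PER_ROTATED
items_per_student_13 = CORE_ITEMS + ITEMS_PER_ROTATED

SECONDARY_FORMS = 8            # last-year secondary: two of eight forms
ITEMS_PER_SECONDARY_FORM = 17
pool_secondary = SECONDARY_FORMS * ITEMS_PER_SECONDARY_FORM
items_per_student_secondary = 2 * ITEMS_PER_SECONDARY_FORM


def assign_rotated_forms(student_ids, n_forms=ROTATED_FORMS, seed=0):
    """Randomly assign one rotated form (1..n_forms) to each student."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return {sid: rng.randint(1, n_forms) for sid in student_ids}


assignment = assign_rotated_forms(range(1000))
print(pool_13, items_per_student_13)                # 176 74
print(pool_secondary, items_per_student_secondary)  # 136 34
```

Under these assumptions the pools reproduce the 176 and 136 items reported, while each student answers only a fraction of the pool (74 and 34 items, respectively).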

In addition to the achievement tests, three other questionnaires were included in the 
cross-sectional survey: 

• Student Background Questionnaire: gathering information about parents (e.g., 
education and occupation) and about the students' attitudes toward mathematics;



30David Robitaille and Robert Garden, The IEA Study of Mathematics II: Contexts and Outcomes of
School Mathematics (Oxford: Pergamon Press, 1989).
31See Figure II.1.

32Participating countries in the longitudinal study of the younger population were Belgium (Flemish),
Canada (British Columbia and Ontario), France, Israel, New Zealand, Thailand, and the United States. In
addition, Canada (British Columbia and Ontario) and the United States participated in the longitudinal study
of the older population.

33Findings of the longitudinal study are forthcoming in L. Burstein, The IEA Study of Mathematics III
(Oxford: Pergamon Press, 1992).






• Teacher Questionnaire: gathering information about teaching experience, training, 
qualifications, and attitudes (the longitudinal study also explored instructional
techniques); and 

• School Questionnaire (completed by the school administrator): concerned with 
student demographics, teaching staff background, the mathematics curriculum, and 
aspects of mathematics instruction. 

Taken together, these supplemental questionnaires were designed to provide an 
enhanced contextual analytical base. 

Sample design and field outcomes. From the standpoint of sample quality, the
Second International Mathematics Study has probably received more attention than any of
the other international surveys. A published report by Robert Garden summarizes the
samples and sampling procedures in considerable detail,34 discusses a variety of technical
problems with the data, and identifies gaps in the information available.

Appendix tables A.4 and A.5 summarize the response rates. For the 13-year-olds, 12
systems did not provide complete sampling information and 4 others, which did supply
outcomes, did not meet the NCES 85 percent response rate standard, for a total of 16 of
the 20 participating systems. In other words, 16 of the 20 participating educational systems
(including the United States) were either unable to provide response rates at all stages of the
sampling process, or had a response rate of less than 85 percent at one or more stages.35
For example, if one were to apply the NCES response rate standard to the 13-year-old
algebra sample, it would raise questions about data from 4 of the 5 systems with the
highest mean scores.36 Among the last-year secondary samples, 9 of
15 systems reported response rates that fell below the standard, or failed to provide
complete sampling information.37 Looking at one last-year secondary testing area, number
systems, 2 of the 5 systems with the highest mean scores did not provide sampling
information.38 The U.S. sample had low response rates to the SIMS, although they were
evidently better than in the previous studies, especially at the school district level (48 percent).
Bock and Spencer39 argue that actual U.S. response rates for samples for both public and
private school strata were under 35 percent, when the combined effect of district, school,
and class response rates is calculated.40
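
The "combined effect" of response rates at successive sampling stages is conventionally taken as the product of the stage-level rates, on the assumption that each rate is computed among the units that cooperated at the previous stage. The Python sketch below illustrates the calculation. Only the 48 percent district rate comes from the text; the school and class rates are hypothetical placeholders, and the result should not be read as Bock and Spencer's figure.

```python
from math import prod


def combined_response_rate(stage_rates):
    """Multiply stage-level response rates to get the overall rate."""
    return prod(stage_rates)


district = 0.48   # school district rate reported in the text
school = 0.80     # hypothetical
classroom = 0.90  # hypothetical

overall = combined_response_rate([district, school, classroom])
print(round(overall, 3))  # 0.346
```

Even with fairly high rates at the later stages, a low rate at the first stage caps the overall rate well below the 85 percent guideline.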

Beyond the issue of response rates, documentation indicated some significant 
deviations from the definitions of the target populations in different countries.41 From
country to country, the age of sampled students also varied considerably. Further, the 



34U.S. Department of Education, Center for Education Statistics, Robert A. Garden, Second IEA
Mathematics Study Sampling Report (Washington, DC, March 1987).

35Belgium (Flemish), Belgium (French), Canada (Ontario), Canada (British Columbia), England and Wales,
Finland, France, Hong Kong, Israel, Japan, New Zealand, Nigeria, Scotland, Swaziland, United States.
36Japan, Canada, Belgium (Flemish), France.

37Belgium (Flemish), Belgium (French), Canada (British Columbia), Canada (Ontario), Hong Kong, Israel,
Scotland, Sweden, United States.
38Hong Kong, England and Wales.

39Darrell Bock and Bruce Spencer, "On Statistical Standards of the Second International Mathematics
Study" (unpublished report, September 1985).
40Ibid., 26.

41For example, the Netherlands did not include 20 percent of the grade 8 equivalent in the sample for
various reasons; Nigeria sampled only 8 of 20 states in the country; Hungary included a broader population 






complicated definition of the secondary-level target resulted in substantial sampling
inconsistencies across participating entities. (In effect, different countries established
different targets within the proposed target description.)



The Second International Science Study (SISS)

Purpose. Like the Second Mathematics Study, the Second International Science
Study (SISS) was an effort to build on successful aspects of earlier work as well as to
expand the scope of the research. The objectives of the study were to describe science
education in the participating countries; to examine between-country achievement
differences and, where possible, to explain their sources; to attempt to explain differences
in achievement between students within countries;42 and to examine changes in
achievement outcomes between the two science studies.43 The study was intended to derive
results that could be reliably compared across countries, but there was also a strong 
commitment to collecting information that would help analysts who were particularly 
interested in the status of the science curriculum within countries. 

Participants and survey content. Three populations were surveyed:44 10-year-olds
(U.S. grade 5) from 15 educational systems; 14-year-olds (U.S. grade 9) from 17 systems;
and students in the last year of secondary education (U.S. grade 12) from 14 systems. The
eldest population was divided into four subgroups for testing purposes: those studying
biology, those studying chemistry, those studying physics, and those not studying a
science subject during the test year. Across all educational systems, a total of 260,830
students, 22,612 teachers, and 9,578 schools participated in the study.

For the 10-year-olds, the achievement tests consisted of 24 core items administered to
all students, and four tests of 8 items randomly assigned among those taking the test. The
achievement test for the 14-year-olds included 30 core items and four groups of 10 items
each randomly assigned. For those in the last year of secondary school, there were
specialized tests involving 39 items in biology, 39 in chemistry, and 38 in physics. A high
proportion of items were taken from the First Science Study.

The Second Science Study included five instruments in addition to the achievement
tests:

• Student Questionnaire: gathering basic information including sex, age, grade level,
family background, time spent in class on science, and time spent on science
homework;

• Attitude Questionnaire and Other Scales: measuring students' perception of science 
teaching, and verbal and quantitative skills; 

• Process Exercises: an optional instrument measuring students' ability to handle 
science equipment, design experiments, and make observations; 



for the terminal year of high school than was called for by the definition; and Scotland sampled two grades,
either of which could be considered the terminal year of secondary school.

42T. Neville Postlethwaite, Second International Science Study, Vol. II Draft (Hamburg, July 1990), 11.
43Malcolm J. Rosier, "The Second International Science Study," Comparative Education Review 31(1)
(1987): 107.
44See Figure II.1.






• Teacher Questionnaire (given to those who taught science to the students in the
sample): to obtain information on teachers' qualifications and to rate opportunity to
learn; and

• School Questionnaire (completed by school principals): profiling student
demographics and teaching staff background.

Sample design and field outcomes. As with the other IEA surveys, depending on the
size of the target population, two- or three-stage probability samples were drawn.
Appendix tables A.9-A.13 summarize the sample response rates. All participating systems
provided data.

For the 10-year-olds, 7 of the 15 participating educational systems failed to achieve
85 percent response rates at each stage of the sampling process, including 1 system among
the 5 with the highest mean scores.45 Among the 14-year-olds, 7 of the 17 systems failed
to achieve the response rate guideline, including 1 system among the 3 with the highest
mean scores.46 The U.S. sample did not achieve the NCES response rate guideline. The
last-year secondary samples were extremely complex to draw, involving specialized
curricula with little information available as to the number of students or classes making up
the targets. About one-half of the countries were unable to provide complete information on
each step of the sampling process. Furthermore, samples at the last-year secondary level
became very small, and in some cases response rates were exceptionally low. Some
countries sampled intact classes, and some selected students within classes. Using
the biology test as an example, only 7 of the 14 participating educational systems even
provided complete information on the sample, and of these only 1 met the response rate
standard, thus including only 1 system among the 5 with the highest mean scores. In
general, the U.S. sample sizes, of both schools and students, were very small and did
not achieve 85 percent response rates.

As reported by Postlethwaite,47 exclusions were also significant. Less developed
countries had very high levels of exclusion, often reflecting the small proportion of 
children past elementary age who were still enrolled in school. Other countries excluded 
small schools or tested only in the national language, which were factors likely to influence 
mean score performance. At the secondary level, enrollment in school, in science, and in 
certain science subjects varied dramatically from country to country, making it virtually 
impossible to ascertain comparability of targets or samples. 

From the inception of the survey process, the construction of the U.S. sample was 
problematic. Sampling lists were available only 6 months before testing. Since time was
short, the decision was made not to follow a replacement strategy of drawing parallel sets 
of schools. Instead, a group of schools was selected that was twice the designed sample 
size. In other words, since the plan called for a sample of 125 schools, in order to assure 
an adequate sample size, 250 schools were asked to participate. Such an approach, while 
safeguarding the final size of the sample, does not reduce the problems introduced by 
selective nonresponse. 
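
The point that doubling the number of invited schools protects sample size but not representativeness can be illustrated with a small expected-value calculation. The Python sketch below is hypothetical throughout: it assumes two equally sized groups of schools, with the higher-scoring group responding more often, and the response rates and mean scores are invented for illustration rather than taken from the U.S. data.

```python
def expected_respondents(invited, strata):
    """Expected respondent count and mean score when schools are invited
    in proportion to each stratum's population share.

    strata: list of (population_share, response_rate, mean_score).
    """
    counts = [invited * share * rate for share, rate, _ in strata]
    total = sum(counts)
    mean = sum(c * score for c, (_, _, score) in zip(counts, strata)) / total
    return total, mean


# Hypothetical strata: half the schools score higher and respond more often.
strata = [(0.5, 0.8, 520.0), (0.5, 0.4, 480.0)]
true_mean = sum(share * score for share, _, score in strata)

n125, mean125 = expected_respondents(125, strata)
n250, mean250 = expected_respondents(250, strata)
print(n125, n250)
print(true_mean, round(mean125, 2), round(mean250, 2))
```

In this sketch, inviting 250 schools instead of 125 doubles the expected number of respondents, but the expected respondent mean is the same in both cases and remains above the population mean, because the mix of responding schools is unchanged.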

The last-year secondary response rates during the first year of testing were very low,
and the following year another sample of schools was drawn and tested. Analysis of the
U.S. data, however, showed that when items that were on both the first and second 



45Sweden.
46Canada (English).

47T. Neville Postlethwaite, The Second International Science Study, Vol. II Draft (Hamburg, 1990).






international science study tests were compared, second science students significantly
outperformed first science students. This seemed questionable since the results of the
National Assessment of Educational Progress showed no comparable improvement in
performance over the same time interval (roughly 1970 through 1983-84). The conclusion
was that in the United States the Second International Science Study sample had
underrepresented schools in which there would be larger proportions of "poor" performers.
An entirely new data collection effort was mounted three years later, based on a completely
new sample drawn in 1986. The objective was to correct for the underrepresented
populations. This "phase two" sample became the official U.S. data set. All the reported
scores were based on the phase two test results. The fact that the U.S. data were collected
on two occasions raises questions about their utility.



The IAEP Study



The First International Assessment of Educational Progress (IAEP-I):
Mathematics and Science

Purpose. The International Assessment of Educational Progress (IAEP-I) is related to
another research program, the National Assessment of Educational Progress (NAEP),
which has been conducted in the United States periodically since 1969. The initial IAEP,
administered in February 1988, was designed to be exploratory in nature (although the
results are often discussed as though they were definitive).48 The IAEP had two objectives:
to examine the feasibility of reducing the time and money spent on international
comparative studies by capitalizing on design, materials, and procedures developed for the
U.S. NAEP; and to permit interested countries to experiment with NAEP technologies to
see whether or not they were appropriate for local evaluation projects.49 Within this
framework, the Educational Testing Service argued that the study should be used to
"provide teachers, school administrators, policymakers, and taxpayers with information
that helps to define the characteristics of successful student performance and suggests areas
for possible improvement and change."50

Participants and survey content. Six countries (12 educational systems) participated in
the study.51 The target population was defined as all students born during the calendar year
1974, that is, students ranging in age from 13 years, 1 month to 14 years, 1 month at the
time of testing.

The tests were organized around the following topics: 

Mathematics: numbers and operations, relations and functions, geometry, 
measurement, data organization, and logic and problem solving. 



48A second IAEP study was conducted in the fall of 1990 and winter of 1991. Findings are to be published in
early 1991. IAEP-II tested mathematics, science, and geography proficiency among 9- and 13-year-olds. For
the 9-year-olds, 18 systems participated in the mathematics and science assessment. For the 13-year-olds, 30
systems participated in the mathematics assessment, 29 in the science assessment, and 17 in geography.
49Benjamin F. King, A World of Differences: Technical Report, Part I (Princeton: Educational Testing
Service, 1989), 2.

50Archie E. Lapointe, Nancy A. Mead, and Gary W. Phillips, A World of Differences (Princeton:
Educational Testing Service, 1989), 7.
51See Figure II.1.






Science: life science, physics, chemistry, earth and space science, and nature of 
science. 

Test items were drawn from the 1986 NAEP. There were 63 mathematics questions 
selected from a pool of 281 questions and 60 science questions chosen from a pool of 188. 
All science questions were multiple choice, and 14 of the mathematics questions were 
open-ended. Each test was 45 minutes in length. 

Score comparisons were made on the basis of scales representing levels of 
proficiency, set to a mean of 500 and a standard deviation of 100. Hence, the study was 
designed to measure relative levels of competency, in contrast with the lEA research, which 
did not propose any proficiency measurement scales.
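
A scale "set to a mean of 500 and a standard deviation of 100" can be illustrated with a linear rescaling of scores. The Python sketch below shows only that arithmetic. It is not a description of the IAEP scaling procedure, which the text does not detail, and the raw scores and the function name are hypothetical.

```python
from statistics import mean, pstdev


def to_scale(raw_scores, target_mean=500.0, target_sd=100.0):
    """Linearly rescale scores to the target mean and standard deviation."""
    m = mean(raw_scores)
    s = pstdev(raw_scores)  # population SD of the supplied scores
    return [target_mean + target_sd * (x - m) / s for x in raw_scores]


raw = [12, 18, 25, 31, 44]  # hypothetical raw scores
scaled = to_scale(raw)
print(round(mean(scaled), 6), round(pstdev(scaled), 6))
```

On such a scale a score of 600 sits one standard deviation above the mean of the group used to set the scale, which is what allows results to be described in terms of relative levels of proficiency.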

Sample design and field outcomes. The sampling plan called for a multi-stage cluster 
of 50 pairs of schools, a total of 100 schools, and a sample of about 20 students per 
school, or about 2,000 students per country. (Small schools were combined with adjacent
schools to create "superschools" for sampling purposes.) The general sampling strategy 
involved two or three stages of selection, with a total among all countries of 24,000 
students participating in the study. 
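
The design figures above can be tied together with simple arithmetic. The Python fragment below is a sketch under one stated assumption: that the target of about 2,000 students applies to each of the 12 educational systems rather than to each of the six countries, since that reading is the one that reproduces the reported total of 24,000.

```python
PAIRS_OF_SCHOOLS = 50
STUDENTS_PER_SCHOOL = 20   # "about 20" in the text
SYSTEMS = 12               # educational systems, per the participants paragraph

schools_per_system = PAIRS_OF_SCHOOLS * 2
students_per_system = schools_per_system * STUDENTS_PER_SCHOOL
total_students = SYSTEMS * students_per_system
print(schools_per_system, students_per_system, total_students)  # 100 2000 24000
```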

Ten of the 12 participating systems achieved 85 percent response rates at each stage in
both mathematics and science (Appendix tables A.14 and A.15).



General Perspective on Samples and Sample Quality

As described in this chapter, representative sampling on past international
achievement surveys has been an elusive goal. Cursory review of field outcomes, using
information published in conjunction with each of the four IEA studies, for instance,
suggests that there have been significant deviations from sampling plans and real
shortcomings in field execution. Utilizing the NCES response rate guidelines, U.S. data
would be excluded from every IEA study at each grade level. The guideline should not be
viewed as unreasonable, however. With care in administration and adequate resources, it is
achieved regularly on a variety of voluntary, large-scale surveys in the United States. The
International Assessment of Educational Progress, in contrast, achieved higher quality field
outcomes than the IEA, but the samples were small and few countries participated in the
study.

Four conclusions are inescapable:

1. Few educational systems participating in the IEA studies achieved response rates
approaching the NCES guideline. Since studies of non-response were not 
published (and little research on this matter was conducted), the impact of non- 
response on the survey results represents a significant concern. 

2. It is not clear that comparable populations have been tested across participating 
countries. 

3. From study to study, country to country, and age group to age group, there is 
considerable variability in sample quality. As yet, no single standard has been
established as a basis against which samples are assessed before data analysis.
Some of the variation in quality has to do with the execution of the sampling 
process, and some is a result of differences in the basic character of the target 
populations, particularly at the secondary level. 






4. In many cases, sample sizes were very small. This should have influenced the
design of the analyses and the results reported.

Many countries, including the United States, have had real difficulty achieving high 
response rates, thereby raising questions about sample representativeness. Until such 
problems are resolved, interpreting results of the international achievement surveys, the 
subject of the next chapter, requires caution. 



Summary 

The five studies described in this chapter are the core group of international 
achievement surveys of mathematics and science. Their objectives and scope set them in 
sharp contrast to small-scale studies, or case studies of selected populations or particular 
communities. Taken together, they represent a significant effort to develop ways of 
measuring and comparing the determinants of educational outcomes and the performance of 
educational systems, using modern survey and data processing techniques. Given 
questions of data quality raised in this chapter, key results of these studies, discussed in the 
following two chapters, should be viewed cautiously because they are more likely 
indicative of achievement-related trends and patterns, rather than definitive and conclusive. 



Chapter III 



The International Achievement Studies: 
Mathematics and Science Scores 



The achievement studies described in Chapter II are undertakings of unusual 
complexity and scope (some surveys involving more than 100,000 students) and a test of 
the methodological capabilities of even the most sophisticated researchers. Over a period of 
25 years, extraordinary talent, considerable time, and substantial resources have been 
brought to bear on this relatively new field of study. But the data have proven difficult to 
analyze and still harder to interpret. Taken together, the body of research is so large that it 
is hardly amenable to a brief overview of results. In fact, this represents both a strength and 
a weakness of the international achievement surveys. 

Despite the technical issues cited in the preceding chapter, the international surveys do 
help determine "where we stand" in mathematics and science achievement— that is, the 
performance of American students as compared with students from other countries. The 
studies also suggest some of the possible reasons why these differences in performance 
occur. The studies are useful because of the consistency of many of their findings and the 
internal relationships identified, and because they frequently corroborate education and 
social theory. The focus of this chapter is on where we stand, while the next chapter 
discusses other key results. 

Many of the results are study-specific (i.e., not corroborated for the same subject area 
in other studies and substantiated by another study's findings only occasionally). Even 
within a single study, findings for one population may be unique. As discussed in the next 
chapter, there are many reasons why this may be so; nevertheless, this fact constrains the 
way in which the material is reviewed here. 

Beyond recognizing that the same educational systems did not participate in each 
survey, and that the sample targets and survey objectives distinguish these studies from 
each other, several caveats — ones that dictate against simple interpretations of the 
mathematics and science scores— should be mentioned. 

• Participating educational systems were self-selected. International achievement 
studies do not offer comparisons of students from the same educational systems or 
comparably aged students from survey to survey. Since participation in each study 
was voluntary, the reported rankings do not represent the U.S. standing among all 
nations of the world or even among all developed nations, but only among those 
who chose to participate in each study. 

• Sample quality has much to do with the level of confidence one can have in the 
scores reported. As described in Chapter II, much of the data is technically 
problematic; hence, the scores must be viewed with caution. 

• As noted in the preceding chapter, there is no consistency in how the sampled 
populations were defined. Different studies tested students of different ages, and 
participating educational systems did not consistently apply uniform sampling 
criteria. It is not always possible to characterize performance of "an age group" or 
"a grade level" across (or even within) studies.52 

• Educational researchers, curriculum specialists, and psychometricians have 
devoted extraordinary effort to developing instruments that could be used in every 
country participating in the international assessments. Countries are not scored 
against their own curriculum, and the scores are not adjusted based on differences 
in curriculum. The approach that has been agreed upon has the advantage of 
comparing performance against a common standard, derived through consensus 
and not bound by national curricular differences. But this procedure raises some 
questions. Even a cursory review of IEA national committee reports53 indicates 
that in each country there are some categories of items tested that are not taught at 
all; some that are of low priority; and some that are entirely outside the instructional 
objectives for a particular age or grade group. It would have been helpful to 
policymakers if mean performance scores had also been measured and reported 
against national curriculum (called the "intended" curriculum by the IEA). This 
approach would document how each country's results measure against its own 
instructional objectives. Presentation of results in these two ways would have 
answered the dual issues of: 1) how well do students perform (clearly affected by 
the differences in the curriculum); and 2) how well do students learn what they 
have been taught. 

• The international testing community has devoted considerable attention to 
ascertaining curriculum differences (called the "implemented" curriculum by the IEA) 
among countries participating in the achievement studies. A persistent problem, 
however, is how to account for these differences in the reporting of test scores. 
Kenneth Travers, for example, discusses the extent to which items on the Second 
Mathematics Study are reportedly taught in each participating country.54 For the 
13-year-olds, on a topic-by-topic basis,55 "opportunity to learn"56 for items on 
each test ranged from 31 percent of the tested items in some countries to 95 percent 
in others (the U.S. range was 44 percent for items tested in geometry to 87 percent 
for arithmetic). For the last-year secondary sample,57 "opportunity to learn" 
ranged from 29 percent of items in some countries to 100 percent in others (the 
U.S. range was 46 percent of items on the probability and statistics test to 88 
percent on the algebra test). The Travers findings signal a critical issue. The theory 
of opportunity to learn is a major contribution of the IEA research, but it is not 
taken into account in the summarized presentations of mean scores and country 
rankings. To do so could possibly affect country rankings and provide a 
counterpoint to measured mean scores. 

52For an effort in this direction, see John Keeves, ed., The IEA Study of Science III: Changes in Science 
Education and Achievement 1970 to 1984 (Oxford: Pergamon Press, 1991). 

53See Chapter II for a description of the procedures used by the IEA to define test content. 

54Kenneth Travers, "The Second International Mathematics Study: Overview of Major Findings," 
(unpublished paper, Urbana-Champaign: University of Illinois, 1986). All the other IEA studies, as well as 
the IAEP, grapple with this problem to one degree or another. 

55Ibid., 36. 

56"Opportunity to learn" is an issue of considerable concern to those who attempt to develop achievement 
tests for cross-national studies. The concept means attempting to recognize differences between test content 
and curriculum (especially difficult to estimate for the United States, which has no centralized education 
authority). In the IEA studies every question on each achievement test is evaluated by a sample of teachers, 
who are asked to rate the probability with which students taking the test will have been taught the material 
necessary to answer the question correctly. It is assumed that teachers are in a position to know. (In fact, 
teachers in one grade might not know with assurance the substance of coursework from other grades.) 

57Ibid., 45. 

These notes, along with the sampling issues discussed in Chapter II, are essential to 
the overview of test results that follows. The mean scores must be viewed with caution. 
While the scores offer a general perspective on the performance of American students in 
comparison with students from other countries on mathematics and science achievement, 
they must be carefully qualified. 
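
To make the "opportunity to learn" percentages cited above more concrete, the following minimal sketch shows one way such a figure could be computed from teacher ratings. It is an illustration only: the function name, the data layout, the majority rule for counting an item as "taught," and the sample ratings are all assumptions, and the aggregation procedure actually used in the IEA studies is not described in this chapter.

```python
def opportunity_to_learn(ratings_by_item):
    """Percent of test items that a majority of teachers report as taught.

    `ratings_by_item` maps an item id to a list of booleans, one per teacher,
    True when that teacher reports the material was taught.
    The majority rule is an assumption made for this sketch.
    """
    taught = sum(
        1
        for ratings in ratings_by_item.values()
        if sum(ratings) / len(ratings) > 0.5
    )
    return 100.0 * taught / len(ratings_by_item)


# Hypothetical ratings for four geometry items from three teachers.
geometry_ratings = {
    "g1": [True, True, False],
    "g2": [False, False, True],
    "g3": [True, True, True],
    "g4": [False, False, False],
}
print(opportunity_to_learn(geometry_ratings))  # 50.0
```

A figure of this kind, reported next to each mean score, would let a reader see how much of a test a country's students had actually been taught.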

International Achievement Test Scores: Interpreting the Results 

While researchers have argued that the international achievement surveys are not 
designed to be an academic Olympics, the general public has been exposed to little more 
than the test scores and country rankings, leading to an inevitable overinterpretation of their 
meaning and import. Although the test results may be viewed as a kind of indicator, this 
cannot be done responsibly without also understanding the methods used to collect and 
report the data and the degree to which samples of students either represent or fail to 
represent sampled cohorts. 

Appendices B and E summarize the achievement scores survey by survey. Scores are 
reported in two ways. Appendix B shows, in tabular form, the measured means and 
comparisons of the mean scores of each participating educational system with the U.S. 
score. Appendix E calculates the confidence intervals58 for all countries and graphically 
shows comparisons with the United States.59 While the confidence intervals lack the 
precision implied by the means, they represent a reasonable reporting framework since they 
presume no greater accuracy than the data permit. The groupings in Appendix B show 
those participating educational systems whose mean scores are higher, lower, or within the 
same range as the U.S. score. These comparisons are summarized in tables III.1 and III.2. 
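
As a rough illustration of how an interval of this kind is formed, the sketch below applies the mean ± 1.96 × SE rule that the report uses for Appendix E. The function names and the example means and standard errors are hypothetical, chosen only for illustration; they are not values from any of the studies.

```python
def confidence_interval(mean, standard_error, z=1.96):
    """Return the (lower, upper) bounds of an approximate 95 percent interval."""
    margin = z * standard_error
    return mean - margin, mean + margin


def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical values, not taken from any study report.
us = confidence_interval(mean=50.0, standard_error=1.5)
other = confidence_interval(mean=53.0, standard_error=1.2)
print(us, other, intervals_overlap(us, other))
```

Overlapping intervals are only a visual cue that two means may not be distinguishable; they are not a substitute for a formal significance test.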

For those unfamiliar with the statistical issues underlying the presentation of these 
tables, a brief note may be helpful. Because sampling techniques are used, it is not always 
possible to say whether the actual mean scores for some educational systems differ 
statistically from those of the United States. Even though the measured mean of one 
country may be higher or lower than that of the United States, the difference may not be 
statistically significant. As a result, the rank ordering of the United States could be different 
from that which is suggested by the measured mean score. Thus, if there are several 
countries whose measured scores are not significantly different statistically, this suggests 
that the sample size was not large enough to know for certain that the actual scores are 
different; any one country might actually have the highest or lowest score. For example, in 
looking at Appendix B, table 4 (from the Second Mathematics Study), the measured scores 
of seven other countries are not significantly different statistically from the United States, 
when the U.S. mean score is compared with the scores of other countries. While the 
measured mean score suggests that the United States "ranks 10th," statistically speaking, 



58Confidence intervals are estimated by mean ± 1.96 x SE, except for the First Mathematics Study as noted 
in Appendix B. Standard errors are drawn from the study reports themselves. Methods of calculation were 
not always reported. 

59Using Bonferroni adjustments, countries were compared with the United States, and scores were 
categorized as higher, the same, or lower than the United States, based on the results of t-tests at a 5 percent 
significance level. In Appendix E, in general terms, based on sampling error estimates, 95 percent of the 
time this range will include the actual country mean score between the upper and lower end of the range 
defined in the figures. Exactly where the actual mean score for the population falls in the range is not 
known, although the measured mean for the sample is shown. 




Table III.1: International achievement test scores summary 

                                        Number of      Number of      U.S. rank    Number of      Number of          Number of
                                        participating  participating  by measured  participants   participants       participants
                                        educational    countries      mean score   significantly  not significantly  significantly
                                        systems¹                                   higher         different          lower
                                                                                   than U.S.      from U.S.          than U.S.

First Mathematics Study (IEA)
Age 13-Core test                        12             12             11           9              1                  1
Last-year secondary-Math students       12             12             12           11             0                  0
Last-year secondary-Non-Math students   10             10             10           7²             0                  0

Second Mathematics Study (IEA)
Age 13-Arithmetic                       20             18             10           5              7                  7
Age 13-Algebra                          20             18             12           7              8                  4
Age 13-Geometry                         20             18             16           10             5                  3
Age 13-Measurement                      20             18             18           17             0                  2
Age 13-Statistics                       20             18             8            2              12                 5
Last-year secondary-Number systems      15             13             12           9              [illegible]        [illegible]
Last-year secondary-Algebra             15             13             14           11             3                  0
Last-year secondary-Geometry            15             13             12           10             4                  0
Last-year secondary-Calculus            15             13             12           10             3                  1

First Science Study (IEA)
Age 10-Core test                        17⁴            16             4            1              5                  5
Age 14-Core test                        19⁴            18             7            5              5                  3
Last-year secondary-Core test           19⁴            18             14           10             3                  0

Second Science Study (IEA)
Age 10-Core test                        15             15             8            5              4                  5
Age 14-Core test                        17             17             14           10             5                  1
Last-year secondary-Biology             14³            13             14           12³            0                  0
Last-year secondary-Chemistry           14³            13             12           9³             2                  1
Last-year secondary-Physics             14²            13             10           7²             1                  3

International Assessment (IAEP)
Age 13-Mathematics                      12             6              12           10             1                  0
Age 13-Science                          12             6              9            8              3                  0

¹In some countries more than one province participated, or more than one language group in the same country participated as a separate testing entity. 
²Data not available for 2 participating educational systems. See Appendix B, E. 
³Data not available for 1 participating educational system. See Appendix B, E. 
⁴Data not available for 5 participating educational systems. See Appendix B, E. 

SOURCE: See Appendix B. 



Table III.2: Number of other participating systems scoring significantly higher than the United States by age or grade and 
number of participating systems 

                        Age 10                Age 13                    Age 14                Last-year secondary

First Math Study                              9 of 11 (Core test)                             11 of 11 (Math students)
                                                                                              7 of 9 (Non-math students)¹

Second Math Study                             5 of 19 (Arithmetic)                            9 of 14 (Number systems)
                                              7 of 19 (Algebra)                               11 of 14 (Algebra)
                                              10 of 19 (Geometry)                             10 of 14 (Geometry)
                                              17 of 19 (Measurement)                          10 of 14 (Calculus)
                                              2 of 19 (Statistics)

First Science Study     1 of 11 (Core test)²                            5 of 13 (Core test)   10 of 13 (Core test)

Second Science Study    5 of 14 (Core test)                             10 of 16 (Core test)  12 of 12 (Biology)³
                                                                                              9 of 12 (Chemistry)³
                                                                                              7 of 11 (Physics)¹

IAEP                                          10 of 11 (Mathematics)
                                              8 of 11 (Science)

¹Data not available for 2 additional participating educational systems. 
²Data not available for 5 additional participating educational systems. 
³Data not available for 1 additional participating educational system. 



SOURCE: See Appendix 



the real picture is less definitive. The United States could actually rank anywhere from 6th 
to 13th. 
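
The arithmetic behind such a range, and the kind of comparison that produces the groupings, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the example means and standard errors are hypothetical, and a normal approximation stands in for the t-tests with Bonferroni adjustment mentioned in the notes, whose exact computation is not reported here. Only the counts passed to `rank_range` (5 systems significantly higher, 7 not significantly different) come from the Age 13 arithmetic row of table III.1.

```python
from statistics import NormalDist


def classify_against_us(us_mean, us_se, others, alpha=0.05):
    """Label each system 'higher', 'lower', or 'not different' from the U.S.

    `others` maps a system name to (mean, standard_error). The overall
    significance level is split evenly across comparisons (Bonferroni).
    A normal approximation is used in place of a t-test (an assumption).
    """
    adjusted_alpha = alpha / len(others)
    critical = NormalDist().inv_cdf(1 - adjusted_alpha / 2)
    labels = {}
    for name, (mean, se) in others.items():
        # Standard error of the difference between two independent means.
        z = (mean - us_mean) / (us_se ** 2 + se ** 2) ** 0.5
        if z > critical:
            labels[name] = "higher"
        elif z < -critical:
            labels[name] = "lower"
        else:
            labels[name] = "not different"
    return labels


def rank_range(n_higher, n_not_different):
    """Best and worst U.S. rank consistent with the significance groupings."""
    best = n_higher + 1
    worst = n_higher + n_not_different + 1
    return best, worst


# Hypothetical systems, for illustration only.
example = classify_against_us(
    50.0, 1.0, {"A": (60.0, 1.0), "B": (50.5, 1.0), "C": (40.0, 1.0)}
)
print(example)

# Counts from the Age 13 arithmetic row of table III.1.
print(rank_range(n_higher=5, n_not_different=7))  # (6, 13)
```

The best case places the United States just below the systems that scored significantly higher; the worst case places it below every system whose score could not be distinguished from its own.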

In examining each study's results, note again that the surveys did not test the same 
subject matter in the same way from one study to another. This precludes representing 
trends beyond the very general. Nor were the same age and grade levels tested from study 
to study, further inhibiting across-study comparisons. 

Among countries participating in the international studies reviewed here (the United 
States is the only country to have participated in them all), there is considerable movement 
in mean score and rank within age/grades and between subjects. Japan is a rare exception, 
ranking at or very near the top in almost every test. In some cases, U.S. performance is 
clearly low relative to that of other educational systems, but it is sometimes near the top or 
in the middle relative to other participants (Appendices B and E). 



Summary 

This brief description of results on the international mathematics and science surveys 
is not intended to obscure the general point that students from the United States have not 
performed very well on any international achievement study. At the same time, the reality is 
somewhat less clear than the picture that has been conveyed in the media. Generally, across 
the surveys, younger American students seem to perform better, relative to their 
international peers, than those enrolled in the last year of secondary school. Even here, 
however, using caution is essential, because the secondary school populations upon which 
the survey samples are based differ dramatically from country to country. 

The next chapter describes findings associated with achievement that hold across the 
international surveys and also identifies other findings linked to achievement that are unique 
to a single subject area, age group, or study. 




Chapter IV 



What We Know About the Achievement Scores and Country 
Rankings: A Summary of Selected Results and 
Hypotheses from the International Surveys 



The five mathematics and science studies have involved students from many countries 
at several grade levels. Taken together, these surveys offer an important perspective on 
differences in achievement across educational systems. 

There is one consistent message. Students from the United States, regardless of grade 
level, generally lag behind many of their counterparts from other developed countries in 
both mathematics and science achievement. That, perhaps, is the only consistent message. 
But caution is necessary. Chapter II identified a variety of technical problems that raise 
questions about the achievement survey data and make it difficult to know the degree to 
which sampling and non-sampling errors may bias the results reported. The discussion in 
the preceding chapters, the standard error tables in Appendix B, and the achievement 
score-related confidence intervals in Appendix E all demonstrate how problematic it is to attempt 
country achievement score ranking comparisons. However, the consistency of the results 
across studies and populations suggests that there is an important underlying theme of 
lagging U.S. performance. 

Although a number of hypotheses have been offered, international surveys have been 
far less successful at explaining why particular groups of students achieve as they do in 
comparison with students of the same age or comparable grade level from other 
countries.60 These studies have not led to consistent conclusions as to why students from 
other countries perform better academically than their American counterparts, and there are 
few powerful correlates associated with the overall pattern of achievement across the 
populations participating in the international surveys. 

Despite the technical flaws of the international studies, this chapter examines a 
number of explanatory issues that have contributed uniquely to our understanding of 
comparative achievement results. The findings seem to supersede the technical flaws for 
several reasons. First, as noted above, is the consistency of some findings across studies 
and age groups with different shortcomings. Second, some of the findings are based on 
internal relationships identified in the data that are less affected by sampling issues. Third, 
some of the findings are important because they identify important hypotheses that appear 
to be supported cross-nationally but may need further exploration. Finally, some of the 
findings corroborate education and social theory that has been developed based on national 
studies, thus supporting the basis for these inferences. 

In fact, the international data inform a variety of issues, which are not specifically 
related to the achievement scores and country rankings. To that end, this chapter pursues 
two lines of inquiry: 



60See, for example: Curtis McKnight, F. Joe Crosswhite, John A. Dossey, Edward Kifer, Jane O. 
Swafford, Kenneth J. Travers, and Thomas J. Cooney, The Underachieving Curriculum: Assessing U.S. 
Mathematics from an International Perspective (Champaign: Stipes, 1989); and John Keeves, ed., The IEA 
Study of Science III: Changes in Science Education and Achievement 1970-84 (Oxford: Pergamon Press, 
1991). 




• The first looks across the studies and asks, "What results do the surveys report in 
common?" At this general level, to the extent that there are any commonalities, they 
mostly describe differences in the way schools are organized and differences in 
national education policy and objectives. 

• The second explores subject- and age-specific results and hypotheses. Here 
mathematics and science studies are discussed separately. A certain number of 
cross-national, cross-test hypotheses regarding the correlates of achievement 
emerge, although most are subject- and grade-level specific. 

Dividing the chapter in this manner provides a broader perspective on the international 
achievement surveys as a body of work, and it raises important tensions in the literature. At 
the level of subject and grade, there are innumerable interesting, and probably productive, 
avenues of investigation from the perspectives of researcher, policymaker, and practitioner 
alike. At the same time, since the studies have not been conducted with a consistent focus 
on a common set of issues, many of the results reported remain uncorroborated across 
surveys. 

The results and hypotheses summarized in this chapter are drawn from published 
papers. The data have not been analyzed independently. An effort, however, has been 
made to report those results that have gained general acceptance (or are the focus of 
ongoing analysis) within the research community. 



Results Reported Across the IEA Studies 

Across the IEA mathematics and science achievement studies, some systematic 
patterns of differences have been observed. 

1. The more content students are taught, the more they learn, and the better they 
perform on the achievement tests. 

While this point may seem obvious, it reflects some important differences 
cross-nationally. From country to country the mathematics and science curricula vary 
considerably: as a result, students at the same grade level may be taught more or less, and 
may be taught more or less intensively in a particular subject area. The result is more or less 
breadth and depth in learning. This proposition represents a theme woven through the IEA 
research. For instance, it has been shown that, in comparison with higher achieving 
countries, the American mathematics curriculum tends to be relatively shallow and narrow. 
A great deal of time is devoted to review and repetition, the work is generally less 
demanding, and teachers have lower expectations of students.61 Students learn what they 
are taught, and there are significant differences in the content of instruction among 
countries at common levels of schooling. 

61See discussions in Curtis McKnight et al., The Underachieving Curriculum; Charles Finn, "Afterword: A 
World of Assessment, A Universe of Data" in International Comparisons, ed. Alan Purves (Alexandria, VA: 
Association for Supervision and Curriculum Development, 1989), 74-81; and R.A. Garden, "The Second 
IEA Mathematics Study," Comparative Education Review 31 (February 1987): 47-58. Low coverage, 
measured by "opportunity to learn" scores, emerges throughout the IEA data in discussions of differences in 
achievement among students from various countries. One paper that looks at this issue from a policy 
perspective is Marshall Smith, "A First Look at the Policy Implications of the Findings of the Second 
Mathematics Study of the IEA" (paper presented at the National Conference on the Teaching and Learning 
of Mathematics in the United States, Champaign-Urbana, University of Illinois, 24 September 1984). See 
also Lorin Anderson and T. Neville Postlethwaite, "What IEA Studies Say about Teachers and Teaching," 
in International Comparisons and Education Reform, ed. Alan Purves (Alexandria, VA: Association for 
Supervision and Curriculum Development, 1989), 74-81. 

2. Although international studies suggest that tracking as practiced in the United 
States seems to be negatively associated with student performance and student 
exposure to challenging coursework, some other countries have stronger forms of 
ability grouping that positively influence their assessment results. 

Tracking, or some type of classroom ability grouping, is standard practice in many 
American schools. Other countries also define the mix of students in schools and 
classrooms, but often this is not called "tracking." For instance, at the secondary level in 
some countries, national selection and placement policies filter students into ability groups 
or otherwise determine which students have access to college preparatory academic 
programs. This may not represent tracking in the American sense, but it has a similar 
effect, although within these highly selective systems there may be little or no tracking. 
Nevertheless, all countries with nearly universal secondary school enrollment practice some 
form of "tracking," whether it is of students into schools, or of students within schools into 
classrooms. The importance of the distinction is that in countries where "tracking" is into 
schools, tracking is associated with higher performance levels. But in countries where 
"tracking" is within schools into classrooms, it is associated with lower performance 
levels. 

In terms of the international achievement surveys, tracking and selection practices 
affect the "pool" of students participating in the international surveys. At the secondary 
level, survey targets from selective systems tend to be from academic programs. So it is 
perhaps not surprising that highly selective educational systems (which do not track 
students in the way that American schools do) tend to produce students who perform better 
on average in the international surveys than students from countries that do track.62 These 
studies have not investigated the effects of school and classroom tracking on students 
performing at lower achievement levels. 

Circumstances are different at presecondary levels before selection policies are in 
evidence. Here students from systems that do not ability group tend to perform better in the 
aggregate on the international achievement tests.63 It has been hypothesized that students 
from some of these countries perform well because there is significant cultural and social 
homogeneity. However, data from the international surveys do not enable analysis of this 
notion, except in very general terms. In Japan, for example, it has been noted that at the 
presecondary level virtually all students are exposed to the entire mathematics curriculum, 
and there is no evidence that students have been sorted.64 

Curriculum exposure, which is related to tracking, shares a common consequence, as 
Kifer writes: 



62See Edward Kifer, "What IEA Studies Say about Curriculum and School Organization" in International 
Comparisons and Education Reform, ed. Alan Purves (Alexandria, VA: Association for Supervision and 
Curriculum Development, 1989), 71. 

63William Platt, "Policymaking and International Studies in Educational Evaluation," eds. Alan Purves and 
Daniel Levine, Educational Policy and International Assessments: Implications of the IEA Surveys of 
Achievement (Berkeley: McCutchan, 1975); also Kifer, "Curriculum and School Organization," 71. 

64Leigh Burstein, ed., The Second International Mathematics Study, Vol. III, Draft, (April 1990), chapters 
11 and 13. 




...early tracking of students has a profound effect on chances for 
many to be exposed to learning experiences offered to a tracked 
elite. By Grade 8 in the United States, for instance, less than 15 
percent of the students are in a track that will require them to take 
calculus in Grade 12, so there is no way that system can produce as 
much knowledge as do systems without early tracking. The practice 
of tracking so early effectively eliminates the possibility for most 
students to experience what is considered the best a school system 
has to offer.65 

While it is not possible to estimate what proportions of the student population get to 
experience different types of mathematics instruction, country by country, McKnight notes 
that eighth-grade American students are found in one of four types of mathematics 
classes: "remedial," "typical," "enriched," and "algebra." Those in remedial classes were 
taught only about one-third of the algebra on the Second Mathematics Achievement Test, 
while those in algebra classes were taught almost all of the algebra on this test. More 
extensive differentiation in curriculum was found at this level of schooling in the United 
States than in any other country participating in the study.66 

Tracking is an issue of special interest to American policymakers and educational 
practitioners. Carefully controlled longitudinal studies in the United States have found a 
modest to non-significant relationship between tracking and student performance once 
pre-existing differences in student ability and background are held constant.67 At the secondary 
level, the apparent negative association between tracking and international performance is 
obscured by the fact that there are so many different types of policies for dealing with 
ability differences that the definition of tracking is problematic. At the presecondary level, 
tracking in the American sense appears to be most directly related to exposure to a particular 
curriculum. Lack of a common definition of the term tracking, applied uniformly across all 
of the countries participating in each achievement survey, suggests that these results must 
be viewed cautiously. 

3. The schooling experience affects learning more in some subject areas than in 
others. 

Certain subjects appear to be school intensive — i.e., more learning and mastery goes 
on in the classroom than outside of it. Among the many subjects that have been examined 
by the IEA, some appear to be more closely associated with school exposure than others.68 
The import of schooling appears to be strongest for subjects such as science and much 
weaker for subjects such as foreign languages. Walker69 hypothesizes that this might also 
hold for mathematics, a curriculum in which parents are not necessarily knowledgeable, 
thereby increasing the school effects. 

4. To the extent that family background characteristics have been captured in the 
international surveys, they have been shown to have explanatory power 
cross-nationally. 



65Edward Kifer, "What IEA Studies Say," 71. 

66McKnight, Underachieving Curriculum, 106. 

67K.L. Alexander and M.A. Cook, "Curricula and Coursework: A Surprising Ending to a Familiar Story," 
American Sociological Review (47) 1982, 636. 

68See, for example, Anderson and Postlethwaite, "Teachers and Teaching." 

69David A. Walker, The IEA Six Subject Survey: An Empirical Study of Education in Twenty-One 
Countries (New York: Wiley, 1976), 228. 




Confirming a well-documented finding in the United States, cross-national studies 
have demonstrated relationships between family background and achievement.70 To 
Americans this point may be well understood. Even though particular background variables 
may mean different things in various countries, multivariate analyses in several IEA studies 
show some associations. In the between-schools analysis of all schools in the First Science 
Study, 33 percent of the explained variance was accounted for by home background 
variables for the 10-year-olds; 45 percent of the explained variance for the 14-year-olds; 
and 44 percent for students in the last year of secondary education.71 In the First 
Mathematics Survey, a smaller proportion of variance was explained by home background 
because "opportunity to learn" variables were extremely powerful in that study's 
multivariate analysis (suggesting the importance of including a broader set of explanatory 
factors in the model).72 These data suggest that among more developed countries at least, 
home background shows some relation to achievement patterns cross-nationally and that 
this is not uniquely a U.S. phenomenon. 

5. Educational systems committed to keeping students enrolled in school score less
well on the international surveys, but they formally educate a larger population.
Japan is an important but lone exception to this proposition, calling the simplicity 
of this link into question. 

Over the three decades of IEA research, the impact of secondary school enrollment
policies on achievement patterns has received considerable attention. Data from the studies
suggest that countries retaining a large proportion of the eligible age group in secondary
school (e.g., the United States and Sweden, which both have high levels of school
"retention") tend to perform less well on the secondary-level achievement tests in part
because a greater range of student skills and capabilities is represented in the student
population.73 According to this argument, countries with higher rates of student retention
are producing more knowledge across a larger population base.74 The issue of retention



70 Family background variables are discussed in each of the IEA study reports and are often treated as groups
of variables in multiple regression analyses. See Purves and Postlethwaite, "Teachers and Teaching" and
Anderson and Postlethwaite, "What IEA Studies Say." While this conclusion may hold generally across
more developed countries, Heyneman and others have analyzed the IEA data along with data from other
sources. They report, among other things, that at the country level, the lower the income of the country,
the weaker the influence of pupils' social status on achievement, and "...conversely, in low-income
countries, the effect of school and teacher quality on academic achievement in primary school is
comparatively greater." See Steven Heyneman and William A. Loxley, "The Effect of Primary School
Quality on Academic Achievement across 29 High- and Low-Income Countries," American Journal of
Sociology 88 (6) (May 1983), 1162-94; and Steven P. Heyneman, "The Search for School Effects in
Developing Countries" (Seminar Paper No. 33, Washington, DC: The World Bank Economic Development
Institute, 1986).

71 See David Walker, The IEA Six Subject Survey, 96-97. Four sets of variables were included in the
analysis: home and background (including proxy SES measures and parents' education and occupation);
school type and program (including class size and "opportunity to learn"); and learning conditions and
"kindred variables" (attitudes, interests, motivation, out-of-school time use, and so forth).
72 Torsten Husen, International Study of Achievement in Mathematics: A Comparison of Twelve
Countries, Vol. 2 (New York: John Wiley, 1967), 286.

73 The magnitude of these negative relationships varies considerably from survey to survey, ranging from
marginal to substantial depending on the kinds of analyses undertaken.

74 For excellent discussions, see David Robitaille and Kenneth Travers, "International Studies in
Mathematics Education," forthcoming; see also David A. Walker, The IEA Six Subject Survey, 279; M.
David Miller and Robert L. Linn, "Cross National Achievement with Differential Retention Rates," Journal
for Research in Mathematics Education 20 (1) (1989), 28-40; Ian Westbury, "The Problem of Comparing
Curriculums across Educational Systems," in International Comparisons and Education Reform, ed. Alan




tends to reflect broader educational and social policy objectives. As more countries increase 
their student retention rates at the secondary level, the issue may lose its power. Or, scores 
among some countries that previously had highly selective secondary systems may decline 
relatively as they retain a greater number and variety of students in school. But this is not 
always the case. In Japan, for instance, secondary school retention has increased 
dramatically over the past two decades, but Japanese students continue to perform near the 
highest level on both the mathematics and science surveys. This result appears to be unique 
to Japan. No other country with rapidly rising rates of student retention exhibits a 
comparable pattern. Further, these data are difficult to evaluate over time, and there is no 
way of knowing whether the high levels of performance among Japanese students in the
1980s are as "high" as they were at an earlier point in time when retention rates were lower.
However, this significant exception suggests the importance of further research on the 
issue of whether factors other than breadth of retention have a greater effect on student 
performance. The issue of school retention and selectivity highlights one of the areas in 
which sampling age-level cohorts might offer more representative national achievement 
estimates than grade-level cohorts, at least among secondary students. 

6. Generally, the "best students" in the United States do less well on the 
international achievement surveys when compared with the "best students" from 
other countries. 

Although this may reflect the nature of the school population (which is less selective 
in the United States), it deserves consideration. For example, on the algebra subtest in the 
Second Mathematics Study, achievement among the top 1 percent of U.S. 12th-grade
students was lower than achievement among the top 1 percent of any other country. On
functions and calculus, the top 5 percent of U.S. students scored in a lower range than the
top 5 percent of students from almost every other participating system.75 Linn and Miller
argue that while retention rates on the Second Mathematics Survey overall accounted for
some achievement differences for the more able students, variables such as opportunity-to-learn
were more important in explaining differences in achievement scores across
participating systems.76 On the IAEP, where 9 percent of American 13-year-olds
performed at the second highest mathematics proficiency level, 40 percent of Koreans 
performed at that level. 
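
The comparison of "best students" depends on adjusting for retention: the top 1 or 5 percent of the whole age group is a larger share of the students still enrolled when retention is low. The Python sketch below illustrates that adjustment. It assumes, as a simplification and not as the published Miller and Linn procedure, that the highest achievers in the age group are all still in school; the function name, scores, and retention rates are hypothetical.

```python
def top_cohort_mean(scores, retention_rate, cohort_share):
    """Mean score of the top `cohort_share` of the whole age group.

    scores: test scores of sampled students who are still in school.
    retention_rate: fraction of the age group still enrolled, in (0, 1].
    cohort_share: fraction of the whole age group to keep, e.g. 0.05.
    Assumption: the top of the age group is entirely in school.
    """
    if not 0 < retention_rate <= 1:
        raise ValueError("retention_rate must be in (0, 1]")
    if not 0 < cohort_share <= retention_rate:
        raise ValueError("cohort_share must be in (0, retention_rate]")
    # Share of the enrolled students that corresponds to the
    # requested share of the whole age group.
    enrolled_share = cohort_share / retention_rate
    ranked = sorted(scores, reverse=True)
    n_top = max(1, round(enrolled_share * len(ranked)))
    top = ranked[:n_top]
    return sum(top) / len(top)


# Hypothetical data: the same score distribution in two systems
# that differ only in how much of the age group is still enrolled.
scores = list(range(1, 101))
high_retention = top_cohort_mean(scores, retention_rate=0.8, cohort_share=0.05)
low_retention = top_cohort_mean(scores, retention_rate=0.2, cohort_share=0.05)
```

With identical enrolled-student scores, the selective system's "top 5 percent of the age group" reaches much further down its enrolled distribution, which is why raw comparisons of enrolled students can mislead.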

7. Students from less developed countries do less well on tests of achievement than 
students from more developed countries. 

As participation in the IEA studies has increased over the past two decades, the
differences between more and less developed countries have become very clear. This seems 



Purves (Alexandria, VA: Association for Supervision and Curriculum Development, 1989), 31. In his
analysis of data from the first mathematics study, Husen hypothesizes that "higher levels of mathematical
achievement will be attained by a smaller proportion of those still in school, but by a larger proportion of
the total age group." Torsten Husen, International Study of Achievement in Mathematics II: A Comparison
of Twelve Countries (Oxford: Pergamon Press, 1967), 128.

75 McKnight, The Underachieving Curriculum, 26, 27. Although it is difficult to control for the effects of
selectivity, McKnight writes: "In order to control [for the] selection effects, an analysis was made of the
average achievement in algebra of the top 1% and top 5% of the age group in each country. The results
showed that the U.S. came out as the lowest of any country for which data were available." Miller and
Linn, following Husen's analysis in the first mathematics study, developed the procedure for defining and
calculating achievement scores of the 1 percent and 5 percent cohort. See M. David Miller and Robert L.
Linn, "Cross-National Achievement."

76 M. David Miller and Robert L. Linn, "Cross-National Achievement," 38-40.






to be associated, at least, with differences between more and less developed countries in
curriculum content and in the grade level at which some subject matter is taught. In general,
regardless of subject matter or age group, students from less developed countries do not
perform well on the achievement tests, even though the in-school population in some of
these less developed countries is a small fraction of the age cohort and typically comes from
high-status families.77



Selected Subject and Age Group Results and Hypotheses from Individual 
IEA Studies: Linkages to Achievement

Since most of the international surveys were designed and organized by loose 
administrative consortia, data analyses have not been closely coordinated. As a result, 
researchers have pursued different agendas, and while there may have been opportunities to
corroborate results across studies, this has not often occurred. Hence, many results and
hypotheses appear to be associated with a particular study, subject area, or age or grade 
group — which may or may not be the case — simply because the groups were only tested 
one time. As the following discussion indicates, it seems important to encourage analytical 
replication in the future, so that hypotheses will be tested over time and across subject areas
and age or grade groups. This section describes results associated with achievement and 
three types of variables in individual studies: curriculum, teaching, and instructional 
methods; student characteristics and family background; and organization of schools and 
instructional programs.



Curriculum, Teaching, Instructional Methods, and Achievement

The international surveys have identified a number of linkages between performance and
curriculum and teaching methods.

The First Mathematics Study found considerable variation in curricula across 
systems, especially in the timing of instruction in particular topics and concepts. In some 
countries topics were taught much earlier than in others. (This was particularly true in 
highly selective enrollment systems.) Consistent with this proposition, the Second 
Mathematics Study concluded that students learned what they were taught, and those from 
countries with more demanding curricula learned more of the kinds of items tested in the
survey, and performed better. In other words, "... achievement follows content...." The
study also revealed something that many Americans had not supposed possible: that
students can be taught complex mathematics at a relatively early age.78

In the First Mathematics Study, the "opportunity to learn" variable emerged as an
important indicator of performance, especially at the secondary level.79 In the Second
Mathematics Study, "opportunity to learn" was also closely associated with achievement.
Among 13-year-olds, American students were more likely to have had an arithmetic-based 
curriculum. In other countries the curriculum was more likely to be based on algebra and 




77 For a discussion, see D. Spearitt, "Evaluation of National Comparisons," in The International
Encyclopedia of Educational Evaluation, ed. H.J. Walberg and G.D. Haertel (Oxford: Pergamon Press,
1990), 51-59; and A. Inkeles, "National Differences in Scholastic Performance," Comparative Education
Review 23 (1979): 211-229.
78 Finn, "Afterword," 113.

79 Richard M. Wolf, Achievement in America: National Report of the U.S. for the International Education
Achievement Project (New York: Teachers College, 1977).



geometry.80 As a consequence, where some countries were pressing forward with
conceptually advanced curricula for students at an early age, American students were still
focused on elementary mathematics skill-building. The American curriculum more closely
resembled elementary school as compared with the better-performing countries in which the
curriculum was more like our early high school.81

In the First Science Study, "opportunity to learn" proved central to understanding
achievement score differences, especially for secondary-level students. But no relationship
was found for younger students. Even so, it was noted that in some countries, including
the United States, at the early and middle grade levels science was often not taught as a
separate subject. Instead, science instruction was conducted by regular classroom teachers,
many of whom may not have been equipped to build a foundation for specialized science
learning in future years. By the last year of secondary school, there was ample evidence
that success in science was distinctly related to the quality and extent of instruction.
Different branches of science instruction were stressed to greater or lesser degrees from
country to country, as measured by "opportunity to learn." "What students know about
scientific subjects and ideas when they leave school is... to a large extent in the hands of
those who design the curricula for the subject."82

In the Second Science Study, as with mathematics, science curricula were 
distinguished by differences in timing, that is, when particular subjects were offered to
students. At the lower and middle grade levels, there was substantial variation in the degree
to which specific courses were available in each branch of science. Lower and middle grade
students from systems in which more specific scientific instruction was provided
performed better on the achievement tests.

Distinctions in curriculum and instructional methods were characterized in other 
ways. The Second Mathematics Study documented differences in the level of difficulty of 
the program. Students in the last year of secondary school from systems with higher
retention rates, like the United States, were more likely to be studying algebra or
trigonometry and less likely to be studying more complex subjects like calculus. Program-related
differences were also found in the Second Science Study. In some countries
students were required to take courses in each branch of science separately; in other 
countries they were only required to take general science courses, or they could choose a 
few courses among those available (limited requirements). When there were separate 
science course requirements, subject matter demands were greater, students were taught 
more, and they performed better on the achievement tests. 

Teacher preparation time also has been examined in relation to achievement. Several 
studies showed some relationship of teaching time and teacher preparation time with 
achievement. In both the Second Mathematics and First Science Studies, the amount of 
instructional preparation time for teachers in and outside of school was related to student 
achievement. Teachers in the United States and some other countries had little time 
available during the school day to plan for classes, and they did not spend proportionately 
more time preparing materials after school. 



80 Robitaille and Garden, Mathematics, 238.
81 McKnight et al., Mathematics.
82 Walker, The IEA Six Subject Survey, 232.






Student and Family Background Characteristics and Achievement

Although the data were not consistent with regard to student characteristics, gender
was generally associated with achievement across the international studies. Gender
differences in performance have attracted attention since the First Mathematics Study,
which defined a number of issues. At the most basic level, the proportions of girls enrolled
in mathematics courses varied considerably. Further, among the younger population and
also by and large at the pre-university level, boys expressed more interest in mathematics
than girls.83 Finally, substantial differences in performance across gender were found
among the younger sample population in all countries except the United States and 
Sweden, where the differences in performance between boys and girls were much smaller. 

In the First Mathematics Study, gender differences in achievement scores at the
secondary level were greatest in countries with large proportions of single-sex, as opposed
to coeducational, schools. At the same time, interest in mathematics among girls was higher
in systems with large numbers of single-sex schools. But, while interest levels may have
been higher in these schools, performance was not enhanced.

Gender-related patterns were not consistent, even between the two mathematics 
studies. In the Second Mathematics Study, among the older cohort, there were many more 
male than female mathematics students across all participating systems, and boys almost 
always outperformed girls. Among the younger cohort, girls outperformed boys in some 
topic areas. 

The First Science Study identified still another sex-related pattern. Boys showed a
greater interest in science than did girls, a phenomenon that increased with the older
cohorts. Similarly, with reference to total science score by educational system, boys
outperformed girls at all levels: at age 10 (by about one-quarter of a standard deviation); at
age 14 (by about one-half of a standard deviation); and at the last year of secondary school
(by about three-quarters of a standard deviation).84 In subject interests, boys were more
likely to be enrolled in physical sciences courses, and girls in biological sciences. By the
last year of secondary school, boys generally outperformed girls in all science subjects, but
the gap was considerably less in the biological sciences. In the Second Science Study, boys
scored higher than girls, and the differences increased from elementary to middle school. In
the IAEP, among 13-year-olds, boys and girls performed at about the same level in
mathematics; however, this was not the case in science. Except in the United States and the
United Kingdom, boys systematically performed better than girls.

Beyond the question of gender and performance, some aspects of family background 
should also be mentioned. In the First Mathematics Study, student and family background 
were associated with performance to a greater degree in the United States than in other 
countries. Particularly among the eighth-grade sample, scores were related to parents'
education and father's occupation. In countries other than the United States, the import of
these background variables declined at the secondary level, perhaps because enrollment
selection policies homogenized student profiles in the later years of school.85 The IAEP
found another relationship among family life, activities outside of school, and science 



83 Husen, Mathematics, Vol. 2, 305.
84 Comber and Keeves, Science Education, 139-53.
85 Ibid., 303.






performance: doing better in science was associated with such variables as parents' talking
with children about science topics at home.86



Organization of Schools and of Instruction and Achievement

One other set of issues, bridging several studies, concerns achievement and the 
structure of schools. While the classroom was not explored in a consistent fashion in
relation to achievement variables in the First Mathematics Study, class size did not show a
systematic relationship to performance. (In fact, some of the countries with the largest class
sizes produced some of the highest achieving students.) In the Second Science Study,
except for the youngest students, class size was not related to achievement. Again,
countries with the largest classes tended to have the best scores.87

Similarly in the Second Science Study, across countries, school organization
variables including total hours of school each week, of mathematics instruction each week,
of homework, and of mathematics homework showed virtually no relationship to the
scores of the 13-year-old sample. For students in the last year of secondary school, hours
of mathematics instruction and of mathematics homework showed a small positive
relationship to achievement.88



Summary 

This chapter has selectively summarized results and hypotheses associated with the 
international achievement test scores. The first section described results that held across the
studies at a general level, while the second section focused on lines of inquiry related to
individual international achievement studies. To a degree, the studies are so different in
analytical focus that the results reported seem rather eclectic. To the extent that these results 
can be pursued systematically in future research, policymakers may be able to find more 
ways of applying international findings on curriculum, instruction, and organization, and 
achievement to issues of schooling in America. 



86 Lapointe, A World of Differences, 45.
87 Postlethwaite, Science, chapters 8 and 9.
88 Husen, Mathematics, 300.






Chapter V 



Looking Ahead: Toward Future
International Achievement Surveys 



After three decades of research, the comparative education community has produced a
series of important studies examining differences in mathematics and science achievement
among students of different ages from a number of developed and less developed countries
worldwide. Differences in achievement have been observed, but their magnitude is
uncertain. Because of inconsistencies in sample design and sampling procedures, the nature
of the samples and their outcomes, and other factors, it is difficult to know the degree to
which the past surveys accurately measure student performance across comparable 
populations from country to country. 

Despite their shortcomings, international achievement surveys are now highly valued,
providing a way to explore the import of many schooling inputs and processes that can best
be observed cross-nationally. While results of the international assessments document
many differences in the nature and organization of educational systems, achievement scores
and country-by-country performance rankings have received the most attention. The
interest in scores and rankings demands that the data used by U.S. policymakers and
educators meet high technical standards.

There is considerable evidence that the various international testing authorities and 
consortia are moving expeditiously toward improving the quality of the surveys and 
upgrading their statistical reliability. The National Center for Education Statistics (NCES) 
supports and encourages these efforts because they serve to enhance the utility of these data 
for policymakers and education practitioners. Recognizing that the comparative 
achievement scores and country rankings are likely to become even more visible in the 
future, it is essential for the design of new international studies to reflect lessons learned 
from the past.

This concluding chapter discusses a number of issues raised by the National Research
Council's Board on International Comparative Studies in Education (BICSE),89 and it also
focuses in greater detail on matters related to the design of international achievement 
surveys, the ways the results of the international assessments are reported, and the nature 
of the reporting process itself. Where BICSE outlines broad strategies for the assessment 
process, this chapter offers a more strategic look at issues concerned with the data. 



Areas of Improvement: Sample Comparability and Reporting International 
Achievement Scores 

Congress, the Executive Branch, the media, and much of the general public continue 
to focus attention on test scores and country rankings described in the international
surveys. This poses a real challenge to the comparative education community. Researchers
must continue to elaborate and refine the ways in which they measure cross-national
achievement and must continue in their efforts to describe why differences occur. But to the
extent that the international achievement scores serve as visible "leading indicators" of



89 See Norman M. Bradburn and Dorothy M. Gilford, eds., "A Framework and Principles for International
Comparative Studies in Education" (Washington, DC: National Academy Press, 1990).






educational competitiveness, it is essential that reports accurately reflect real differences in 
achievement among sampled populations.

Early international achievement survey results were compiled from country data of 
widely varying quality. If cross-national comparisons are to be scientifically credible and if 
policymakers are to rely on the findings, then stringent data collection standards must be 
established and achieved by all participating entities. Two ways of improving data quality
in future studies are strengthening sample comparability and adding some adjusted
international score reports that might make it easier for audiences without technical
backgrounds to accurately interpret findings.



Sample Comparability

A standards' review procedure could help assure that reported findings are based on 
accurate, representative sample estimates. To that end, it would be useful to examine 
sampling outcomes from the standpoint of comparability and representativeness before and 
after data are collected and before extensive analysis. This would enable researchers to 
ascertain the degree to which samples represent targets and it would encourage participants 
to devote more attention to sampling issues during development of the survey process in 
each country. At least six questions concerning samples and data quality have arisen from
the international surveys to date:

1. To what extent did the samples meet the study design requirements?

2. Were there differences among countries in how the target populations and 
eligibles were defined? Did each country follow identical procedures?

3. How were modifications to the sample handled? For instance, when countries 
legitimately sample target populations that are not thoroughly comparable with 
those of other nations participating in a study, were these noncomparable 
circumstances articulated, justified, and their implications discussed?90

4. Were the response rates adequate on a country-by-country, stratum-by-stratum 
basis? 

5. Did the characteristics of those declining to participate (or excluded from testing)
differ substantially from country to country? Within countries, did this affect the
degree to which the achieved sample represented the target population? Were the
characteristics of schools in the design sample but not in the achieved sample
compared, and were the comparisons reported?

6. Did the age distributions of test samples differ substantially, and if so, what were
the analytical implications? 
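
Question 4 in the list above (response rates by country and stratum) lends itself to a simple mechanical check. The Python sketch below is illustrative only: the record layout, the country and stratum labels, and the 85 percent threshold are assumptions made for the example, not standards stated in this report.

```python
from collections import defaultdict


def response_rates(records):
    """Return {(country, stratum): responded / sampled}.

    records: iterable of (country, stratum, sampled, responded) tuples,
    one per sampled unit or group of units (hypothetical layout).
    """
    sampled = defaultdict(int)
    responded = defaultdict(int)
    for country, stratum, n_sampled, n_responded in records:
        sampled[(country, stratum)] += n_sampled
        responded[(country, stratum)] += n_responded
    return {
        key: responded[key] / sampled[key]
        for key in sampled
        if sampled[key] > 0
    }


def flag_low_response(rates, threshold=0.85):
    """List the (country, stratum) cells whose rate is below threshold.

    The default threshold is an arbitrary illustration.
    """
    return sorted(key for key, rate in rates.items() if rate < threshold)


# Hypothetical counts of schools sampled and schools that took part.
records = [
    ("Country A", "urban", 40, 38),
    ("Country A", "rural", 20, 12),
    ("Country B", "urban", 50, 45),
]
rates = response_rates(records)
low = flag_low_response(rates)
```

Reporting the flagged cells alongside the scores, rather than only a national total, would show where an acceptable overall rate hides a poorly covered stratum.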

Achieving sample comparability represents an important, but still elusive, goal. As 
described in the following section, significant changes in research designs are being made,
and advances are evident on issues of comparability. But at this point, in some international 
surveys the composition of samples (and of the units sampled) differs substantially from 



90 Sampling differences may be important. In the IAEP, the Inner London Education Authority declined to
participate, so no testing took place in England's principal, and most demographically diverse, city. In the
First International Science Study, only the six Indian states in which Hindi is the official language were






country to country. Two strategic issues underlie the question of comparability: age-level
and grade-level sampling. If age is the basis for sampling, then all participating entities
must assure representative samples of the same age cohort. This becomes complicated, of
course, because from country to country one age group may bridge several schooling
grades, or in a given country, one age group in school may be more or less representative
of a national age cohort. The BICSE report emphasized the importance of defining sampled
populations in similar ways and assuring "comparable coverage of the populations."91
Further, surveys should be able to "support reasonably accurate inferences about an age or
grade cohort, and the proportion of each cohort covered should be carefully estimated and
reported. The sample should be designed to ensure that it captures the range of individual,
school, or classroom variation that exists in the nation sampled."92 Sampling age cohorts
enables comparisons of particular age groups in each country, but it is more costly given
the expense associated with finding and testing students who are out of school and those at
different grade levels. In contrast, if grade is the basis for sampling, all participating
nations should strive to provide comparative information about students who have been in
school for the same number of years.93 Further, grade sampling offers the opportunity to
relate classroom characteristics (e.g., classroom processes and teacher practice) to student 
performance in ways that would not be possible with an age-based sample. 
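
The recommendation that "the proportion of each cohort covered should be carefully estimated and reported" can be made concrete with a small calculation. The Python sketch below compares the share of one age cohort reached by a single-grade sample with the share reached by sampling every grade in which cohort members are enrolled. The function name and all counts are hypothetical, chosen only to illustrate the arithmetic.

```python
def cohort_coverage(age_counts_by_grade, out_of_school, sampled_grades):
    """Fraction of an age cohort reached by sampling the given grades.

    age_counts_by_grade: {grade: cohort members enrolled in that grade}
    out_of_school: cohort members not enrolled in any grade
    sampled_grades: grades included in the sampling frame
    """
    cohort_total = sum(age_counts_by_grade.values()) + out_of_school
    if cohort_total == 0:
        raise ValueError("empty cohort")
    covered = sum(age_counts_by_grade.get(g, 0) for g in sampled_grades)
    return covered / cohort_total


# Hypothetical distribution of 1,000 thirteen-year-olds in one country.
thirteen_year_olds = {7: 150, 8: 600, 9: 50}
grade_based = cohort_coverage(
    thirteen_year_olds, out_of_school=200, sampled_grades=[8]
)
age_in_school = cohort_coverage(
    thirteen_year_olds, out_of_school=200, sampled_grades=[7, 8, 9]
)
```

In this made-up case the single-grade sample reaches 60 percent of the cohort and the in-school age sample 80 percent; neither reaches the students who have left school, which is the cost issue the paragraph above describes.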

Solving problems like those associated with age versus grade cohort testing 
represents a significant concern in terms of deriving samples that are analytically equivalent
across all participating countries and meeting the intended purposes of the assessments.
Evidence from the IEA Reading Literacy Study (see below) suggests that this dilemma is
well recognized and that steps are being taken to improve prospects of sampling
comparability or, at least, assure that minimum standards are achieved on future IEA
achievement surveys. In all cases, the objective should be to assure accurate comparisons
of achievement between countries across school and age cohorts, even within the context of
different policies for selecting and retaining students in school.94



sampled. In both cases, results were reported, and national data were used for international comparisons
without a discussion of the analytical implications.
91 Bradburn and Gilford, "Framework," 9.
92 Ibid., 25.

93 In the BICSE report the following is noted: "...it is not clear whether students should be tested according
to their age or their years in school. Children start school at different ages; first graders may be 5, 6, or 7
years old.... Grade progression also occurs at different rates across countries. Some of the Nordic countries
have policies against repetition. Thus, if one were interested in evaluating achievement at about the
transition between 'lower' and 'middle' school, should one test fourth graders or 9-year-olds? In comparing
systems with different age rules for school entry, there may be quite large differences in the average age of
students...." Bradburn and Gilford, "Framework," 8.

94 The ongoing Education Indicators Project at the OECD raises the question of comparability in data
collection "...given the possibility of widespread system differences." By way of example, the comparability
issue is clearly articulated in the following:

Nations differ in the pattern and intensity of their science instruction. Some prefer exposing
secondary students to an array of scientific subjects. Others choose to immerse secondary students in
one or a relative few subjects. [Of course, some nations may offer no science at all.] An
international comparison of secondary school biology or physics knowledge, in the absence of
information regarding learning opportunities for students, might lead to inaccurate conclusions
regarding a nation's school effectiveness or students' ability levels. Similarly, some nations delay
the onset of formal instruction until a later chronological age than others. Assessing reading
ability at an early age may provide a misleading comparative picture of a particular nation's
educational achievement level. (Organization for Economic Cooperation and Development,
"Assessing Assessment: Considerations in Selecting Cross-National Educational Performance
Indicators," draft report, November 1990.)






Reporting Achievement Scores

Beyond comparability of data, there is the equally important issue of how data are
reported and what is reported. Much that has been learned about the international
achievement survey results suggests that mean score and country ranking reports require
careful qualification, elaboration, and provision of context for interpretation.

Among other things, it is important to discuss factors likely to influence country 
scores. 

• Where systems of education and fundamental national policies affect mean scores
and rankings, these differences should be accounted for in the reporting process.
For instance, at the secondary level, national or local school retention policies in
and of themselves are apparently related to achievement scores.

• Where differences in curriculum positively or negatively affect students who are
taking the international tests, these differences must be articulated. The curricula
in some systems are quite consistent with the elements tested on the surveys. In
these instances, students are likely to answer more questions correctly. This
means that a priori students from some countries are likely to do better on the tests
than their peers from other countries. Using the opportunity to learn indices,
curriculum advantage should be reported along with test scores.

• Where the test formats themselves may affect outcomes, these need to be
investigated and discussed. For example, students from some countries may be
more or less familiar with achievement testing generally and with the particular
formats used in comparative studies. To the extent that they are known, reports
must account for such differences. Further, there are countries in which students
are exposed to a great deal of testing, and, as a result, participating students may
expend less effort on "low-stakes tests" that they believe do not affect their
educational futures. Thus, in reporting data, researchers must consider the
potential consequences of student indifference.95 These kinds of issues could be
addressed in the reporting process, clarifying fundamental differences across
participating entities that may be associated with scores and rankings.

In addition, new data dissemination formats could be constructed as a way of moving 
beyond comparisons of measured mean scores. The following might be two options: 

• Developing sets of scores that would reflect each system's achievement against 
items common to its curriculum. 

• Developing sets of scores against some minimum or optimum performance 
standards, agreed upon by all participating educational systems for the purpose of 
defining the proportions of students achieving at or above that level for a given age 
or grade level. This approach would, admittedly, be more difficult because of the 
problem of achieving international consensus on such a matter, but BICSE 
supports alternatives of this sort with appropriate caution:



95 Bradburn and Gilford, "Framework," 31.




Studies concerned with student achievement data can be enhanced
considerably by reporting outcomes in terms of performance 
standards, for example, the percentage of students who know 
everyday science facts or who use scientific procedures and analyze 
scientific data. This can be difficult to accomplish, however, and 
there is a risk that arbitrarily established standards will lead to 
serious misinterpretations of achievement levels. If results are 
reported relative to specified performance levels (e.g., functional 
literacy), the basis for establishing these levels must be explicit, 
defensible, and responsive to the needs and contexts of all the 
nations involved.96

• Providing standard errors or confidence intervals for all estimates. 
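
To make the last option concrete, the sketch below shows one way a standard error and a 95 percent confidence interval could be attached to a reported mean. It is illustrative only: the function names, the use of a design effect to inflate a simple-random-sample standard error, and the normal approximation are assumptions, not procedures taken from the surveys, and the numbers in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist


def standard_error(sd, n, design_effect=1.0):
    """Standard error of a mean, inflated by an assumed design effect.

    sd / sqrt(n) is the simple-random-sample standard error; it is
    multiplied by sqrt(design_effect) to allow for a clustered sample.
    """
    if n <= 0 or sd < 0 or design_effect <= 0:
        raise ValueError("n and design_effect must be positive, sd non-negative")
    return sd / math.sqrt(n) * math.sqrt(design_effect)


def confidence_interval(mean, se, level=0.95):
    """Two-sided normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean - z * se, mean + z * se


# Hypothetical system: mean of 20.0 items correct, SD 10.0,
# 400 students tested, assumed design effect of 4.0.
se = standard_error(sd=10.0, n=400, design_effect=4.0)
low, high = confidence_interval(20.0, se)
print(f"mean 20.0, SE {se:.2f}, 95% CI {low:.1f} to {high:.1f}")
```

Reporting the interval alongside the mean makes it visible when two systems' scores are too close to be distinguished, which a bare ranking conceals.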

In general, as noted by BICSE, reporting should be 

sensitive to technical limitations on a study's interpretability. 
Limitations might include caveats about the comparability of national
samples, the limited number of test items or range of content on 
which comparisons are based, differences in administration 
conditions from place to place, the match of tests to different 
curricula, the difficulty of translating exercises from one language to 
another, the limited precision of sample statistics, or other 
qualifications on study findings.97



A Place for Small-Scale Studies

Small-scale, intensive case studies can enrich the presentation and interpretation of
data from the large-scale international achievement studies. Harold Stevenson's study of
first and fifth graders in Minneapolis; Taipei, Taiwan; and Sendai, Japan exemplifies a way
of describing the correlates of achievement and developing hypotheses that deserve special
attention.98 Intensive, small-scale projects have an important role to play in building the
international information program; what they lack in breadth, they achieve in depth. At
this micro-level, differences among systems of education can be examined in considerable
detail. For instance, Stevenson's work suggests that some of the factors underlying
differences in achievement between U.S., Japanese, and Taiwanese children are in
evidence as early as the first grade. Clearly, if this kind of finding were sustained in other
case studies, an important new dimension might be added to the international achievement
debate.

Small-scale studies can serve a variety of purposes in the international arena: 

1. They can be used to help identify issues and to develop measures for study in
more generalizable, large-scale sample surveys.

2. They can be used to identify hypotheses appropriate for measurement with
large-scale methodologies, or to study variables that may be of interest to a few countries.



96 Ibid., 33.
97 Ibid., 31.

98 Harold W. Stevenson et al., Making the Grade in Mathematics (Reston, VA: National Council of
Teachers of Mathematics, 1990).




3. They can be used to test new data collection methods.

4. They can be used to help ensure that data from large-scale studies are interpreted
appropriately by providing richer contextual description. When conducted along
with large-scale studies, they provide detail that is often missing from purely
statistical presentations.

Many of the ways in which the findings of cross-national differences in large-scale
studies come to be understood are based on analyses derived from small-scale studies.
While they may not gain the visibility of the large-scale international surveys, these studies
should receive adequate support and attention. Small-scale ethnographies, longitudinal
studies, and case studies represent significant opportunities to quickly learn more about
which exogenous variables and schooling inputs and processes are systematically linked to
performance outcomes. In some instances detailed, purposeful studies of specific
phenomena in a small number of comparable countries may be the best way to identify
variables appropriate to test in a broader variety of settings.



Evidence of Progress: IEA Reading Literacy Study, IAEP-II, and the Third
International Mathematics and Science Study

Two recent international achievement surveys indicate that issues of sampling and
sample comparability are receiving more attention than they have in the past. In 1990 and
1991, the IEA conducted a study of students from 43 educational systems for the purpose
of 1) describing the types and levels of reading literacy; and 2) examining the impact of
varying educational policies and programs as well as home influence on reading literacy.
Two populations were sampled: students in the grade in which most 9-year-olds are
enrolled; and students in the grade in which most 13-year-olds are enrolled. Investigators
were specifically interested in comparing reading achievement among comparable samples
of students in participating educational systems.99

The sampling procedures adopted for the Reading Literacy Study are to be replicated
in the Third International Mathematics and Science Study (TIMSS).100 Therefore, they are
worth mentioning here. The Reading Literacy sampling manual, data collection procedures,
and data coding and cleaning manual were all considerably improved over prior studies.
Detailed field administration procedures were easier to follow, and across all participating
countries, Education Ministry involvement was significantly increased to facilitate the
process of drawing and executing the samples. The impact of these changes was dramatic,
at least for the United States. At the fourth-grade level, school and student response rates
were 87 percent without replacement, while at the ninth-grade level, response rates were 86
percent. With more attention to coordination and administrative detail, it appears that the
overall quality of the data for many countries and the United States will improve. Equally
important, it appears that the countries participating in the Reading Literacy Study were
better able than they were previously to estimate financial need and generate sufficient
support to enable higher quality data collection.




99 International Association for the Evaluation of Educational Achievement, IEA Guidebook 1989 (The

100 International Association for the Evaluation of Educational Achievement, "A Brief Introduction to the
Third International Mathematics and Science Study" (September 1990); and Jeanne Griffith, Eugene Owen,
Lois Peak, and Elliott Medrich, "National Education Goals and the Third International Mathematics and
Science," (paper presented at the Annual Meeting of the American Statistical Association, Atlanta, August
1991).




IAEP-II, which tested 9- and 13-year-olds in mathematics, science, and geography
(13-year-olds only), expanded participation beyond that of the first IAEP. In the 1990-91
test administration, 28 educational systems participated in one or more of the assessments.
Student and school background questionnaires were expanded, although there was still
considerable variation in the relative emphasis educational systems assigned to the various
topics covered by the tests.101 Even so, with its careful sampling strategy, field design,
and increased participation, IAEP-II has addressed some important questions that were
raised by the earlier study.

Both of these studies suggest that the quality of the international assessments is 
improving. When the results of these surveys become available, their value will be 
enhanced to the extent that there will be full and complete information on: 1) sampling 
procedures and field execution; 2) comparability of the samples across participating 
educational systems; and 3) issues that could affect how the data are interpreted, so that 
published comparisons are appropriate to the nature and quality of the data. 



Utility of International Studies for the Policy Agenda 

The bridge between descriptions of performance and matters of policy may be among 
the least satisfying aspects of the international assessment survey literature. While some 
argue that these assessments should remain broadly focused on describing differences in 
achievement, there have been efforts to inform a policy research agenda — some planned, 
others post hoc. While some of the policy analysis has been provocative, often it is 
inconclusive because the research was never intended or designed to answer the questions 
posed. Within the constraints of large-scale survey methodology, efforts should be made 
by researchers to design studies and analyses that tap issues of special interest to the policy
community. So, for example, if there is interest in looking strategically at the substance of
successful programs (that is, policies or conditions that seem associated with superior
performance outcomes), appropriate methods and questions must be built into the research
design. Further, it is understood that successful practice in one country may not necessarily
work in another. Differences in culture and approaches to schooling and teaching are
powerful intervening factors. But the international surveys can and should help to isolate 
aspects of the teaching and learning process that are amenable to policy intervention and, 
therefore, of interest across national borders. 

At the same time, there may be issues of interest to subsets of countries only. Within 
limits, surveys should be flexible enough to enable substudies designed to explore 
questions of concern to groups of countries. This poses many problems, not the least of 
which have to do with resources and time. If each participating country were to pursue its 
own agenda within the context of the international surveys, the testing mechanism might 
collapse. 

The first priority must be to ensure the quality of the common data program. An 
important step in this direction is to assure adequate planning and provision of sufficient 
resources to assure timely, high-quality completion of all phases of the study. 



101 International Assessment of Educational Progress, Center for the Assessment of Educational Progress,
"The 1991 IAEP Assessment: Objectives for Mathematics, Science and Geography" (Princeton: Educational
Testing Service, 1991).






Conclusion 



Large-scale cross-sectional and cross-cultural studies of student achievement yield 
data that are increasingly important indicators of the success of national efforts to educate 
an accomplished citizenry and a productive work force. The United States should 
participate in these studies enthusiastically to foster international cooperation and the 
sharing of information on all aspects of education, to enrich our understanding of our own
system of education, and to help uncover practice in other educational systems that might
help improve achievement among American students.

The objectives of this report have been to make available substantial documentation 
and description of the various international achievement surveys of mathematics and 
science; to identify aspects of the survey design, data collection processes, and reporting of
results that could be improved; to synthesize some of the important findings or hypotheses
generated by these studies; and to suggest some strategies for upgrading data quality in
future studies. 

There are clear indications that many of the concerns discussed in this report will be 
addressed in international studies now being designed and implemented, such as the Third
International Mathematics and Science Study scheduled for 1993-94. As these kinds of
studies attract more attention, it becomes essential that they meet high technical standards. It
is also important that every effort be made to help those who find these data informative
and useful to understand the possibilities and limitations of the survey results. Participating
countries have learned an enormous amount about the challenges of conducting complex,
large-scale international surveys. They have also learned a great deal about the problems
associated with interpreting the results of these studies. The activities now in progress 
provide substantial evidence that the quality of these surveys will improve in the future. 






Appendix A 
Achieved Sample Size and Response Rates 






Table A.1

Sample size and response rates - schools and students:
First International Mathematics Study, 13-year-olds

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Australia                                 72          --       3,078          --
Belgium                                   61          --       2,645          --
England                                  182          --       3,179          --
Federal Republic of Germany              161          --       4,475          --
Finland                                  111          --       1,325          --
France                                   124          --       3,850          --
Israel                                   154          --       3,336          --
Japan                                    210          --       2,050          --
Netherlands                               90          --       1,443          --
Scotland                                  73          --       5,949          --
Sweden                                    80          --       3,712          --
United States                            395          --       6,733          --

--Not available.

NOTE: Data for grade level containing the most 13-year-olds (population 1b).

SOURCE: Data from T. Husen, International Study of Achievement in Mathematics: A Comparison of Twelve
Countries, Vol. I (New York: John Wiley & Sons, Inc., 1967), 158-61.







Table A.2

Sample size and response rates - schools and students:
First International Mathematics Study,
Last year of secondary school (mathematics students)

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Australia                                 56          --       1,089          --
Belgium                                   30          --         519          --
England                                   77          --       1,031          --
Federal Republic of Germany               37          --         649          --
Finland                                   27          --         460          --
France                                    14          --         337          --
Israel                                     8          --         146          --
Japan                                     91          --         818          --
Netherlands                               30          --         491          --
Scotland                                  63          --       1,422          --
Sweden                                    23          --       1,024          --
United States                            149          --       1,660          --

--Not available.

NOTE: Data for mathematics students in last-year secondary (population 3a).

SOURCE: Data from T. Husen, International Study of Achievement in Mathematics: A Comparison of Twelve
Countries, Vol. I (New York: John Wiley & Sons, Inc., 1967), 158-61.






Table A.3

Sample size and response rates - schools and students:
First International Mathematics Study,
Last year of secondary school (non-mathematics students)

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Belgium                                   43          --       1,004          --
England                                   84          --          --          --
Federal Republic of Germany               36          --         643          --
Finland                                   24          --         482          --
France                                    --          --          --          --
Japan                                    349          --       4,372          --
Netherlands                               --          --          --          --
Scotland                                  64          --       2,123          --
Sweden                                    20          --         320          --
United States                            155          --       2,152          --

--Not available.

NOTE: Data for non-mathematics students in last-year secondary (population 3b).

SOURCE: Data from T. Husen, International Study of Achievement in Mathematics: A Comparison of Twelve
Countries, Vol. I (New York: John Wiley & Sons, Inc., 1967), 158-61.






Table A.4

Sample size and response rates - schools and students:
Second International Mathematics Study, 13-year-olds

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Belgium (Flemish)                        158          --       3,103          --
Belgium (French)                         108          86       3,103          --
Canada (British Columbia)                 89          --       2,228          --
Canada (Ontario)                         112          86       5,013          --
England and Wales                         94          82       2,678          84
Finland                                   98          95       4,484          --
France                                   187          99       8,889          --
Hong Kong                                125          --       5,548          --
Hungary                                   70         100       1,754          95
Israel                                    81          82       3,819          78
[Japan]                          [illegible]        [97]       8,091          --
Luxembourg                                42          91       2,106          96
Netherlands                              236         100       5,500          --
New Zealand                              100         100       5,218          --
Nigeria                                   48          72       1,456          72
Scotland                                  76          --       1,356          67
Swaziland                                 25         100         904          --
Sweden                                    96          96       3,585          88
Thailand                                  99          99       4,023          95
United States (Districts)                 93          50         (*)         (*)
United States                            150          83       6,858          76

--Not available.
*Not applicable.

SOURCE: Data from U.S. Department of Education, Center for Education Statistics, Robert A. Garden, Second
IEA Mathematics Study: Sampling Report (Washington, DC, March 1987), section 4.






Table A.5

Sample size and response rates - schools and students:
Second International Mathematics Study, Last year of secondary school

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Belgium (Flemish)                        131          87       2,859          --
Belgium (French)                          87          77       2,062          --
Canada (British Columbia)                 78          --       1,954          --
Canada (Ontario)                          79          93       3,214          --
England and Wales                        312          90       3,578          --
Finland                                   81          92       1,550          88
Hong Kong                                112          --       3,294          --
Hungary                                   92         100       2,455          97
Israel                                    64          70       1,905          72
Japan                                    192          93       7,954         100
New Zealand                               79          99       1,193          98
Scotland                                  54          81       1,501          --
Sweden                                   127          98       2,712          93
Thailand                                  64         100       3,747          90
United States (Districts)                 93          48         (*)         (*)
United States                            150          69       4,671          77

--Not available.

SOURCE: Data from U.S. Department of Education, Center for Education Statistics, Robert A. Garden, Second
IEA Mathematics Study: Sampling Report (Washington, DC, March 1987), section 5.



Table A.6

Sample size and response rates - schools and students:
First International Science Study, 10-year-olds

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Belgium (Flemish)                         31          59         717          42
Belgium (French)                          33          77         767          70
Chile*                                    81          82       1,470          80
England                                  162          79       3,573          73
Federal Republic of Germany               68          46       1,741          46
Finland                                   97          97       1,305          97
Hungary                                  152          99       4,858          95
India*                                   176          50       2,704          26
Iran*                                     53          --       1,640          --
Israel*                                  110          97       1,887          92
Italy                                    298          73       4,503          49
Japan                                    250         100       2,467         100
Netherlands                               60          66       1,629          65
Scotland                                 105          98       2,169          92
Sweden                                    98          99       2,009          96
Thailand*                                 31          94       1,810          82
United States                            239          68       5,479          64

--Not available.

*Mean achievement scores not calculated, or not published; or system did not participate in the achievement test
survey.

SOURCE: Data from Gilbert Peaker, An Empirical Study of Education in Twenty-One Countries: A Technical
Report (New York: John Wiley & Sons, Inc., 1975), 36, 37.






Table A.7

Sample size and response rate - students and schools:
First International Science Study, 14-year-olds

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

[Australia]                      [illegible] [illegible]       5,301          96
Belgium (Flemish)                [illegible] [illegible]         699          39
[Belgium (French)]                        21          34         564          22
Chile*                                   103          75       1,311          72
England                                  146          66       3,256          60
Federal Republic of Germany               83          59       2,233          56
Finland                                   77         100       2,325          98
Hungary                                  210         100       7,026          94
India*                                   155          44       2,931          40
Iran*                                     33          --       1,336          --
Israel*                                  125          91       1,958          80
Italy                                    343          86       7,383          83
Japan                                    196          98       1,945          98
Netherlands                               50          52       1,236          49
New Zealand                               74         100       1,974          91
Scotland                                  70          95       1,982          85
Sweden                                    95          96       2,475          91
Thailand*                                 29          91       1,932          81
United States                            142          57       6,870          46

--Not available.

*Mean achievement scores not calculated, or not published; or system did not participate in the achievement test
survey.

SOURCE: Data from Gilbert Peaker, An Empirical Study of Education in Twenty-One Countries: A Technical
Report (New York: John Wiley & Sons, Inc., 1975), 36, 37.






Table A.8

Sample size and response rates - schools and students:
First International Science Study, Last year of secondary school

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Australia                                194          99       4,197          92
Belgium (Flemish)                         18          34         472          27
Belgium (French)                          42          63       1,231          59
Chile*                                    73          76       2,052          74
England                                   70          32       2,274          27
Federal Republic of Germany               80          80       1,988          71
Finland                                   77         100       1,807          82
France                                   141          90       3,582          87
Hungary                          [illegible]         100       2,855          98
India*                                   124          31       3,153          28
Iran*                                     34          --       1,435          --
Israel*                                   71          84         863          81
Italy                                    253          70      16,437          61
Netherlands                               38          39       1,164          37
New Zealand                               69         100       1,714          83
Scotland                                  69          88       1,328          80
Sweden                                   142          95       2,988          90
Thailand*                                 13          93         724          66
United States                            114          43       5,200          35

--Not available.

*Mean achievement scores not calculated, or not published; or system did not participate in the achievement test
survey.

SOURCE: Data from Gilbert Peaker, An Empirical Study of Education in Twenty-One Countries: A Technical
Report (New York: John Wiley & Sons, Inc., 1975), 36, 37.






Table A.9

Sample size and response rates - schools and students:
Second International Science Study, 10-year-olds

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Australia                                220          78       4,259          67
Canada (English)                         215          69       5,104          67
England                                  181          66       3,748          62
Finland                                  106          96       1,600          86
Hong Kong*                               146          99       5,342          96
Hungary                                  100         100       2,590          95
Italy                                    119          58       5,156          84
[Japan]                                  221          99       7,924          99
Korea                                    146          99       3,489          99
Norway                                    91          62       1,305          54
Philippines                              463          93      16,851          92
Poland                                   199         100       4,390          93
Singapore                                232          92       5,547          92
Sweden                                    64          70       1,449          74
United States                            123          88       2,822          77

*Sampled classes, not schools.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 96, 81.






Table A.10

Sample size and response rates - schools and students:
Second International Science Study, 14-year-olds

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Australia                                233          84       4,917          74
Canada (English)                         209          66       5,543          66
England                                  147          60       3,118          53
Finland                                   90          97       2,546          90
Hong Kong*                               132          99       4,973          95
Hungary                                   99          99       2,515          93
Italy                                    291          72       3,228          89
Japan                                    199          99       7,610          95
Korea                                    189         100     [?],522         100
Netherlands                              224          92       5,025          86
Norway                                    77          65       1,420          59
Philippines                              269          90      10,874          88
Poland                                   201         100       4,520          95
Singapore                                185         100       4,520          95
Sweden                                    69          60       1,461          50
Thailand                                  96          93       3,780          92
United States                            119          85       2,519          69

*Sampled classes, not schools.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 96, 82.






Table A.11

Sample size and response rates - schools and students:
Second International Science Study,
Last year of secondary school (biology students)

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Australia                                164          83       1,631          72
Canada (English)                         187          65       3,254          53
England                                  123          --         884          --
Finland                                   43          94       1,652          84
Hong Kong (form 6)                       158          --       5,960          --
Hong Kong (form 7)                       114          --       3,614          --
Hungary                                   71          --         301          --
Italy                                     12          --         147          --
Japan*                                    38          95       1,212          90
Norway                                    52          --         276          --
Poland                                    71         100         764          45
Singapore                                  8         100         902          84
Sweden                                   119          --       1,232          --
United States                             43          92         659 [illegible]

--Not available.
*Sampled classes, not schools.

NOTE: Australia tested 29 items; United States tested 25 items; others tested 30 items.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 96, 84.






Table A.12

Sample size and response rates - schools and students:
Second International Science Study,
Last year of secondary school (chemistry students)

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Australia                                164          82       1,177          77
Canada (English)                         179          60       2,923          51
England                                  123          --         892          --
Finland                                   44          96         971          83
Hong Kong (form 6)                       158          --       6,018          --
Hong Kong (form 7)                       114          --       3,670          --
Hungary                                   56          --         143          --
Italy                                     24          --         217          --
Japan*                                    43         100       1,468          93
Norway                                    46          --         283          --
Poland                                    71         100         765          45
Singapore                                  8         100         945          74
Sweden                                   119          --       1,172          --
United States                             40          76         537          70

--Not available.
*Sampled classes, not schools.

NOTE: United States tested 25 items; others tested 30 items.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 85, 96.






Table A.13

Sample size and response rates - schools and students:
Second International Science Study,
Last year of secondary school (physics students)

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Australia                                163          82       1,073          76
Canada (English)                         181          64       2,766          54
England                                  125          --         917          --
Finland                                   42          91         810          83
Hong Kong (Form 6)                       158          --       6,025          --
Hong Kong (Form 7)                       114          --       3,679          --
Hungary                                   75          --         398          --
Italy                                    120          --       1,766          --
Japan*                                    36          92       1,187          89
Norway                                    55          --         443          --
Poland                                    79         100       1,716          91
Singapore                                  8         100       1,071          82
Sweden                                   119          --       1,156          --
United States                             35          76         485          64

--Not available.
*Sampled classes, not schools.

NOTE: United States tested 26 items; Canada tested 29 items; others tested [number illegible] items.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 96, 86.






Table A.14

Sample size and response rates - schools and students:
International Assessment of Educational Progress, 13-year-olds
(mathematics proficiency)

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Canada (British Columbia)                 --         100       3,025          98
Canada (New Brunswick: English)           --          95       2,047          92
Canada (New Brunswick: French)            --          91       1,548          74
Canada (Ontario: English)                 --          96       2,008          94
Canada (Ontario: French)                  --          97       2,075          96
Canada (Quebec: English)                  --          97       2,090          98
Canada (Quebec: French)                   --          95       2,186          97
Ireland                                   99          97       2,253          90
Korea                                     --          94       2,243          98
Spain                                    100          89       1,756          98
United Kingdom                            85          70       2,202          94
United States                             --          87         905          90

--Not available.

NOTE: British Columbia tested both public and private schools, but response rate for British Columbia reflects
public schools only.

SOURCE: Data from Archie E. Lapointe, Nancy A. Mead, and Gary W. Phillips, A World of Differences: An
International Assessment of Mathematics and Science (Princeton: Educational Testing Service, January 1989),
84, 85.



Table A.15

Sample size and response rates - schools and students:
International Assessment of Educational Progress, 13-year-olds
(science proficiency)

                                          Schools                 Students
Educational                         Achieved    Response    Achieved    Response
system                                sample        rate      sample        rate
                                               (percent)               (percent)

Canada (British Columbia)                 --         100       3,025          96
Canada (New Brunswick: English)*          --          95       2,047          93
Canada (New Brunswick: French)            --          91       1,548          73
Canada (Ontario: English)                 --          96       2,008          94
Canada (Ontario: French)                  --          97       2,075          96
Canada (Quebec: English)                  --          97       2,090          96
Canada (Quebec: French)                   --          95       2,186          97
Ireland                          [illegible]          97       2,253          90
Korea                                     --          94       2,243          98
Spain                            [illegible]          89       1,756          98
United Kingdom                   [illegible]          70       2,202          94
United States                             --          87         859          90

--Not available.

*Sampled classes, not schools.

NOTE: British Columbia tested both public and private schools, but response rate for British Columbia reflects
public schools only.

SOURCE: Data from Archie E. Lapointe, Nancy A. Mead, and Gary W. Phillips, A World of Differences: An
International Assessment of Mathematics and Science (Princeton: Educational Testing Service, January 1989),
84, 85.






Appendix B 

Mean Scores and 
Means Compared with United States 






Table B.1

Mean scores and means compared with United States:

First International Mathematics Study, 13-year-olds (70 items)

                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Israel                        32.3    0.47      14.7       higher
Japan                         32.2    0.52      16.9       higher
Belgium                       30.4    0.53      13.7       higher
Finland                       26.4    0.60      9.6        higher
Federal Republic of Germany   25.4    0.58      11.7       higher
England                       23.8    0.57      18.5       higher
Scotland                      22.3    0.64      15.7       higher
Netherlands                   21.4    0.61      12.1       higher
France                        21.0    0.70      13.2       higher
Australia                     18.9    0.38      12.3       same
United States                 17.8    0.28      13.3       U.S.
Sweden                        15.3    0.51      10.8       lower

¹Standard error calculated using design effect specified by country in Husen, 158-61.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from Torsten Husen, International Study of Achievement in Mathematics, Vol. II (New
York: John Wiley & Sons, Inc., 1967), 23; Vol. I, 158-61.
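
The "higher / same / lower" column in Table B.1 can be approximated from the means and standard errors alone. The sketch below is one plausible reading of the Bonferroni adjusted comparison, not the authors' documented procedure: it assumes a normal approximation to the t-test, independent samples, a two-sided overall significance level of 0.05, and 11 comparisons (every other system against the United States). The function name `compare_with_us` and these settings are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def compare_with_us(mean, se, us_mean, us_se, n_comparisons, alpha=0.05):
    """Classify a system's mean as 'higher', 'same', or 'lower' than the U.S. mean.

    Assumptions (not stated in the source): normal approximation,
    independent samples, two-sided test, Bonferroni adjustment of alpha.
    """
    # Standard error of the difference between two independent means.
    se_diff = sqrt(se ** 2 + us_se ** 2)
    z = (mean - us_mean) / se_diff
    # Bonferroni: split alpha across all comparisons, two tails each.
    critical = NormalDist().inv_cdf(1 - alpha / (2 * n_comparisons))
    if z > critical:
        return "higher"
    if z < -critical:
        return "lower"
    return "same"


# Means and standard errors taken from Table B.1.
US_MEAN, US_SE = 17.8, 0.28
for name, mean, se in [("France", 21.0, 0.70),
                       ("Australia", 18.9, 0.38),
                       ("Sweden", 15.3, 0.51)]:
    print(name, compare_with_us(mean, se, US_MEAN, US_SE, n_comparisons=11))
```

Under these assumptions Australia's difference of 1.1 items is classified as "same", whereas an unadjusted test at the 0.05 level (`n_comparisons=1`) would classify it as "higher". That is the practical effect of the adjustment: fewer small differences are reported as meaningful.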






Table B.2 

Mean scores and means compared with United States: 

First International Mathematics Study, Last year of secondary school 

(mathematics students— 49 items) 



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Israel                        36.4    1.35      8.6        higher
England                       34.6    0.72      12.6       higher
Belgium                       34.6    0.89      12.6       higher
France                        33.4    0.80      10.8       higher
Netherlands                   31.9    0.60      8.1        higher
Japan                         31.4    1.00      8.6        higher
Federal Republic of Germany   28.8    0.50      9.8        higher
Sweden                        27.3    0.68      11.9       higher
Scotland                      25.5    0.44      10.4       higher
Finland                       25.3    0.65      9.6        higher
Australia                     21.6    0.64      10.5       higher
United States                 13.8    0.51      12.6       U.S.

¹Standard error calculated using design effect specified by country in Husen, 158-61.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from Torsten Husen, International Study of Achievement in Mathematics, Vol. II (New
York: John Wiley & Sons, Inc., 1967), 23; Vol. I, 158-61.






Table B.3

Mean scores and means compared with United States:

First International Mathematics Study, Last year of secondary school

(non-mathematics students— 58 items) 



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Federal Republic of Germany   27.7    0.30      7.6        higher
France                        26.2    --        9.5        --
Japan                         25.3    0.43      14.3       higher
Netherlands                   24.7    --        9.8        --
Belgium                       24.2    0.57      9.5        higher
Finland                       22.5    0.54      8.3        higher
England                       21.4    0.31      10.0       higher
Scotland                      20.7    0.37      9.5        higher
Sweden                        12.6    0.38      6.2        higher
United States                 8.3     0.36      9.0        U.S.

--Not available.

¹Standard error calculated using design effect specified by country in Husen, 158-61.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from Torsten Husen, International Study of Achievement in Mathematics, Vol. II
(New York: Wiley, 1967), 25; Vol. I, 158-61.






Table B.4 

Mean scores and means compared with United States: 
Second International Mathematics Study, 13-year-olds 
(46 core items— arithmetic) 





                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Japan                         60.3    1.5       --         higher
Netherlands                   59.3    1.1       --         higher
Canada (British Columbia)     58.0    1.3       --         higher
Belgium (Flemish)             58.0    1.4       --         higher
France                        57.7    1.3       --         higher
Belgium (French)              57.0    1.8       --         same
Hungary                       56.8    1.5       --         same
Hong Kong                     55.1    0.5       --         same
Canada (Ontario)              54.5    1.1       --         same
United States                 51.4    1.2       --         U.S.
Scotland                      50.2    0.5       --         same
Israel                        49.9    1.5       --         same
England and Wales             48.2    0.9       --         same
New Zealand                   45.6    1.2       --         lower
Finland                       45.5    1.3       --         lower
Luxembourg                    45.4    0.4       --         lower
Thailand                      43.1    1.3       --         lower
Sweden                        40.6    0.9       --         lower
Nigeria                       40.8    1.3       --         lower
Swaziland                     32.3    1.4       --         lower

--Not available.

¹Standard error calculated in Robitaille and Garden. Method not specified.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Education Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 105; U.S. Department of Education, Center for
Education Statistics, Robert A. Garden, Second IEA Mathematics Study: Sampling Report (Washington,
DC, March 1987), 115; U.S. Department of Education, National Center for Education Statistics, Digest of
Education Statistics (Washington, DC, 1989), 389.






Table B.5 

Mean scores and means compared with United States: 
Second International Mathematics Study, 13-year-olds
(30 core items— algebra) 



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Japan                         60.3    1.6       --         higher
France                        55.0    0.9       --         higher
Belgium (Flemish)             52.9    1.7       --         higher
Netherlands                   51.3    1.2       --         higher
Hungary                       50.4    1.2       --         higher
Belgium (French)              49.1    2.0       --         higher
Canada (British Columbia)     47.9    1.4       --         higher
Israel                        44.0    1.6       --         same
Finland                       43.6    1.3       --         same
Hong Kong                     43.2    0.8       --         same
Scotland                      42.9    0.7       --         same
United States                 42.1    1.2       --         U.S.
Canada (Ontario)              42.0    0.7       --         same
England and Wales             40.1    1.1       --         same
New Zealand                   39.4    1.1       --         same
Thailand                      37.7    1.0       --         same
Nigeria                       32.4    1.7       --         lower
Sweden                        32.3    0.8       --         lower
Luxembourg                    31.2    0.5       --         lower
Swaziland                     25.1    1.5       --         lower

--Not available.

¹Standard error calculated in Robitaille and Garden. Method not specified.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Education Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 105; U.S. Department of Education, Center for
Education Statistics, Robert A. Garden, Second IEA Mathematics Study: Sampling Report (Washington,
DC, March 1987), 115; U.S. Department of Education, National Center for Education Statistics, Digest of
Education Statistics (Washington, DC, 1989), 389.






Table B.6 

Mean scores and means compared with United States: 
Second International Mathematics Study, 13-year-olds 
(39 core items - geometry)

                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Japan                         57.6    1.3       --         higher
Hungary                       53.4    1.0       --         higher
Netherlands                   52.0    1.0       --         higher
Scotland                      45.5    0.6       --         higher
England and Wales             44.8    0.8       --         higher
New Zealand                   44.8    1.0       --         higher
Canada (Ontario)              43.2    0.7       --         higher
Finland                       43.2    1.2       --         higher
Belgium (French)              42.8    1.5       --         same
Belgium (Flemish)             42.5    1.1       --         higher
Hong Kong                     42.5    0.5       --         higher
Canada (British Columbia)     42.3    1.2       --         same
Sweden                        39.4    0.8       --         same
Thailand                      39.3    0.9       --         same
France                        38.0    0.8       --         same
United States                 37.8    0.9       --         U.S.
Israel                        35.9    1.4       --         same
Swaziland                     31.1    1.3       --         lower
Nigeria                       26.2    0.8       --         lower
Luxembourg                    25.3    0.4       --         lower

--Not available.

¹Standard error calculated in Robitaille and Garden. Method not specified.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Education Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 105; U.S. Department of Education, Center for
Education Statistics, Robert A. Garden, Second IEA Mathematics Study: Sampling Report (Washington,
DC, March 1987), 115; U.S. Department of Education, National Center for Education Statistics, Digest of
Education Statistics (Washington, DC, 1989), 389.






Table B.7 

Mean scores and means compared with United States: 
Second International Mathematics Study, 13-year-olds
(24 core items— measurement) 



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Japan                                 1.3       --         higher
Hungary                       62.1    1.4       --         higher
Netherlands                   61.9    1.0       --         higher
France                        59.5    0.9       --         higher
Belgium (Flemish)             58.2    1.3       --         higher
Belgium (French)              56.8    1.5       --         higher
Hong Kong                     52.6    0.4       --         higher
Canada (British Columbia)     51.9    1.3       --         higher
Finland                       51.3    1.2       --         higher
Canada (Ontario)              50.8    0.9       --         higher
Luxembourg                    50.1    0.4       --         higher
Sweden                        48.7    1.0       --         higher
England and Wales             48.6    0.9       --         higher
Scotland                      48.4    0.7       --         higher
Thailand                      48.3    1.1       --         higher
Israel                        46.4    1.2       --         higher
New Zealand                   45.1    1.1       --         higher
United States                 40.8    0.9       --         U.S.
Swaziland                     35.2    1.3       --         lower
Nigeria                       30.7    1.1       --         lower

--Not available.

¹Standard error calculated in Robitaille and Garden. Method not specified.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Education Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 105; U.S. Department of Education, Center for
Education Statistics, Robert A. Garden, Second IEA Mathematics Study: Sampling Report (Washington,
DC, March 1987), 115; U.S. Department of Education, National Center for Education Statistics, Digest of
Education Statistics (Washington, DC, 1989), 389.






Table B.8 

Mean scores and means compared with United States:
Second International Mathematics Study, 13-year-olds 
(18 core items — descriptive statistics) 

                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Japan                         70.9    1.5       --         higher
Netherlands                   65.9    0.9       --         higher
Canada (British Columbia)     61.3    1.3       --         same
Hungary                       60.4    1.4       --         same
England and Wales             60.2    0.9       --         same
Scotland                      59.3    0.5       --         same
Belgium (Flemish)             58.2    1.5       --         same
United States                 57.7    1.1       --         U.S.
Finland                       57.6    1.1       --         same
France                        57.4    1.0       --         same
New Zealand                   57.3    1.1       --         same
Canada (Ontario)              57.0    1.0       --         same
Sweden                        56.3    1.1       --         same
Hong Kong                     55.9    0.6       --         same
Belgium (French)              52.0    1.7       --         same
Israel                        51.9    1.3       --         lower
Thailand                      45.3    1.0       --         lower
Luxembourg                    37.3    0.4       --         lower
Nigeria                       37.0    1.3       --         lower
Swaziland                     36.0    1.7       --         lower

--Not available.

¹Standard error calculated in Robitaille and Garden. Method not specified.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Education Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 105; U.S. Department of Education, Center for
Education Statistics, Robert A. Garden, Second IEA Mathematics Study: Sampling Report (Washington,
DC, March 1987), 115; U.S. Department of Education, National Center for Education Statistics, Digest of
Education Statistics (Washington, DC, 1989), 389.






Table B.9 

Mean scores and means compared with United States: 

Second International Mathematics Study, Last year of secondary school 

(17 items — number systems) 



                              Percent of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean¹   mean²     deviation  with U.S.³

Hong Kong                     78      1.5       --         higher
Japan                         68      1.1       --         higher
Sweden                        62      0.8       --         higher
England and Wales             59      0.8       --         higher
Finland                       57      1.1       --         higher
New Zealand                   51      1.4       --         higher
Belgium (Flemish)             48      1.1       --         higher
Canada (Ontario)              47      0.9       --         higher
Israel                        46      1.5       --         higher
Belgium (French)              44      1.5       --         same
Canada (British Columbia)     43      1.3       --         same
United States                 40      1.1       --         U.S.
Scotland                      39      1.1       --         same
Thailand                      33      1.2       --         lower
Hungary                       28      1.3       --         lower

--Not available.

¹Data available rounded to nearest whole number.
²Standard error calculated in Robitaille and Garden. Method not specified.
³Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Education Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 130; U.S. Department of Education, Center for
Education Statistics, Robert A. Garden, Second IEA Mathematics Study: Sampling Report (Washington,
DC, March 1987), 120; Curtis C. McKnight et al., The Underachieving Curriculum (Champaign: Stipes,
1989), 125.






Table B.10

Mean scores and means compared with United States: 

Second International Mathematics Study, Last year of secondary school 

(26 items — algebra) 



                              Percent of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean¹   mean²     deviation  with U.S.³

Hong Kong                     78      1.4       --         higher
Japan                         78      1.0       --         higher
Finland                       69      0.8       --         higher
England and Wales             66      0.6       --         higher
Belgium (Flemish)             60      1.2       --         higher
Sweden                        51      0.8       --         higher
Israel                        60      1.5       --         higher
New Zealand                   57      1.2       --         higher
Canada (Ontario)              57      1.0       --         higher
Belgium (French)              55      1.6       --         higher
Scotland                      48      0.9       --         higher
Canada (British Columbia)     47      1.4       --         same
Hungary                       45      1.5       --         same
United States                 43      1.2       --         U.S.
Thailand                      38      1.4       --         same

--Not available.

¹Data available rounded to nearest whole number.
²Standard error calculated in Robitaille and Garden. Method not specified.
³Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Education Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 130; U.S. Department of Education, Center for
Education Statistics, Robert A. Garden, Second IEA Mathematics Study: Sampling Report (Washington,
DC, March 1987), 120; Curtis C. McKnight et al., The Underachieving Curriculum (Champaign: Stipes,
1989), 125.






Table B.11

Mean scores and means compared with United States:

Second International Mathematics Study, Last year of secondary school

(26 items — geometry) 



                              Percent of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean¹   mean²     deviation  with U.S.³

Hong Kong                     65      1.4       --         higher
Japan                         60      1.1       --         higher
England and Wales             51      0.5       --         higher
Sweden                        49      0.5       --         higher
Finland                       48      0.8       --         higher
New Zealand                   43      1.0       --         higher
Scotland                      42      0.8       --         higher
Canada (Ontario)              42      0.7       --         higher
Belgium (Flemish)             42      1.1       --         higher
Belgium (French)              38      1.3       --         higher
Israel                        35      1.5       --         same
United States                 31      1.0       --         U.S.
Hungary                       30      1.1       --         same
Canada (British Columbia)     30      1.2       --         same
Thailand                      28      0.9       --         same

--Not available.

¹Data available rounded to nearest whole number.
²Standard error calculated in Robitaille and Garden. Method not specified.
³Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Education Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 130; U.S. Department of Education, Center for
Education Statistics, Robert A. Garden, Second IEA Mathematics Study: Sampling Report (Washington,
DC, March 1987), 120; Curtis C. McKnight et al., The Underachieving Curriculum (Champaign: Stipes,
1989), 125.






Table B.12 

Mean scores and means compared with United States: 

Second International Mathematics Study, Last year of secondary school 

(46 items— elementary functions and calculus) 



                              Percent of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean¹   mean²     deviation  with U.S.³

Hong Kong                     71      1.9       --         higher
Japan                         66      1.5       --         higher
England and Wales             58      0.6       --         higher
Finland                       55      1.1       --         higher
Sweden                        51      0.8       --         higher
New Zealand                   48      1.1       --         higher
Canada (Ontario)              46      1.0       --         higher
Belgium (Flemish)             46      1.1       --         higher
Israel                        45      1.6       --         higher
Belgium (French)              43      1.4       --         higher
Scotland                      32      0.9       --         same
United States                 29      1.2       --         U.S.
Thailand                      26      0.8       --         same
Hungary                       26      1.1       --         same
Canada (British Columbia)     21      1.0       --         lower

--Not available.

¹Data available rounded to nearest whole number.
²Standard error calculated in Robitaille and Garden. Method not specified.
³Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Education Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 130; U.S. Department of Education, Center for
Education Statistics, Robert A. Garden, Second IEA Mathematics Study: Sampling Report (Washington,
DC, March 1987), 120; Curtis C. McKnight et al., The Underachieving Curriculum (Champaign: Stipes,
1989), 125.






Table B.13 

Mean scores and means compared with United States: 

First International Science Study, 10-year-olds (40 core items)







                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Japan                         21.7    0.31      7.7        higher
Sweden                        18.3    0.39      7.3        same
Belgium (Flemish)             17.9    0.65      7.3        same
United States                 17.7    0.30      9.3        U.S.
Finland                       17.5    0.54      8.2        same
Hungary                       16.7    0.28      8.0        same
Italy                         16.5    0.31      8.6        same
England                       15.7    0.34      8.5        lower
Netherlands                   15.3    0.45      7.6        lower
Federal Republic of Germany   14.9    0.43      7.5        lower
Scotland                      14.0    0.43      8.4        lower
Belgium (French)              13.9    0.62      7.1        lower

¹Standard error derived from Peaker (√deff = 2.4).
²Based on Bonferroni adjusted t-test for comparisons with the United States.

NOTE: Mean scores not available for five systems: Chile, India, Iran, Israel, Thailand. Score may not have
been calculated, or score may not have been published, or system may not have participated in the
achievement test survey.

SOURCE: Data from L.C. Comber and John P. Keeves, Science Education in Nineteen Countries (New
York: John Wiley & Sons, Inc., 1973), 159; Gilbert F. Peaker, An Empirical Study of Education in
Twenty-One Countries: A Technical Report (New York: John Wiley & Sons, Inc., 1975), 36, 37.
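
Footnote 1 reports that the standard errors were derived with a design-effect factor of √deff = 2.4. A minimal sketch of that adjustment follows, assuming the usual form in which the simple-random-sample standard error s/√n is multiplied by √deff. The function name and the sample size in the example are hypothetical; the table does not report sample sizes.

```python
from math import sqrt


def design_adjusted_se(sd, n, root_deff=2.4):
    """Standard error of a mean from a clustered sample.

    sd        -- standard deviation of student scores
    n         -- number of students tested (hypothetical in the example below)
    root_deff -- square root of the design effect; 2.4 is the value
                 the footnote attributes to Peaker
    """
    srs_se = sd / sqrt(n)      # standard error under simple random sampling
    return root_deff * srs_se  # inflate for the clustered design


# Hypothetical illustration: sd = 8.0 items, n = 4,096 students.
print(round(design_adjusted_se(8.0, 4096), 2))
```

Because students are sampled in intact schools or classes rather than individually, the factor of 2.4 means each reported standard error is 2.4 times what a simple random sample of the same size would give.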






Table B.14

Mean scores and means compared with United States:

First International Science Study, 14-year-olds (80 core items)



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Japan
Hungary
New Zealand
Federal Republic of Germany
Sweden
United States
Scotland
England
Belgium (Flemish)
Finland
Italy
Netherlands
Belgium (French)

                              31.2    0.81      14.8       higher
                              29.1    0.36      12.7       higher
                              24.6    0.44      13.4       higher
                              24.2    0.70      12.9       higher
                              23.7    0.58      11.5       higher
                              21.7    0.56      11.7       same
                              21.6    0.34      11.6       U.S.
                              21.4    0.77      14.2       same
                              21.3    0.59      14.1       same
                              21.2    0.84      9.2        same
                              20.5    0.53      10.6       same
                              18.5    0.28      10.2       lower
                              17.8    0.68      10.0       lower
                              15.4    0.89      8.8        lower

¹Standard error derived from Peaker (√deff = 2.4).
²Based on Bonferroni adjusted t-test for comparisons with the United States.

NOTE: Mean scores not available for five systems: Chile, India, Iran, Israel, Thailand. Score may not have
been calculated, or score may not have been published, or system may not have participated in the
achievement test survey.

SOURCE: Data from L.C. Comber and John P. Keeves, Science Education in Nineteen Countries (New
York: John Wiley & Sons, Inc., 1973), 159; Gilbert F. Peaker, An Empirical Study of Education in
Twenty-One Countries: A Technical Report (New York: John Wiley & Sons, Inc., 1975), 37.






Table B.15

Mean scores and means compared with United States:

First International Science Study, Last year of secondary school 

(60 core items) 



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

New Zealand                   29.0    0.63      11.6       higher
Federal Republic of Germany   26.9    0.48      8.9        higher
Australia                     24.7    0.40      10.7       higher
Netherlands                   23.3    0.78      11.1       higher
Scotland                      23.1    0.80      12.1       higher
England                       23.1    0.58      11.5       higher
Hungary                       23.0    0.40      9.0        higher
Finland                       19.8    0.55      9.8        higher
Sweden                        19.2    0.45      10.2       higher
France                        18.3    0.35      8.7        higher
Belgium (Flemish)             17.4    0.90      8.1        same
Italy                         15.9    0.16      8.8        same
Belgium (French)              15.3    0.54      7.9        same
United States                 13.7              9.5        U.S.

¹Standard error derived from Peaker (√deff = 2.4).
²Based on Bonferroni adjusted t-test for comparisons with the United States.

NOTE: Mean scores not available for five systems: Chile, India, Iran, Israel, Thailand. Score may not have
been calculated, or score may not have been published, or system may not have participated in the
achievement test survey.

SOURCE: Data from L.C. Comber and John P. Keeves, Science Education in Nineteen Countries (New
York: John Wiley & Sons, Inc., 1973), 159; Gilbert F. Peaker, An Empirical Study of Education in
Twenty-One Countries: A Technical Report (New York: John Wiley & Sons, Inc., 1975), 37.






Table B.16 

Mean scores and means compared with United States:

Second International Science Study, 10-year-olds (24 core items) 



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Japan                         15.4    0.07      4.0        higher
Korea                         15.4    0.16      4.2        higher
Finland                       15.3    0.15      4.0        higher
Sweden                        14.7    0.16      4.0        higher
Hungary                       14.4    0.23      4.5        higher
Canada (English)              13.7    0.13      4.3        same
Italy                         13.4    0.26      4.7        same
United States                 13.2    0.18      4.6        U.S.
Australia                     12.9    0.18      4.5        same
Norway                        12.7    0.30      4.1        same
Poland                        11.9    0.16      4.5        lower
England                       11.7    0.17      4.5        lower
Singapore                     11.2    0.18      4.1        lower
Hong Kong                     11.2    0.20      4.2        lower
Philippines                   9.5     0.16      4.5        lower

¹Standard error jackknifed, see IEA, 23.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 81, 96.
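
Footnote 1 states that the standard errors were jackknifed. The source does not describe the variant used, so the sketch below shows one common form, the delete-one-group jackknife of a mean, in which each sampled school (or other cluster) is dropped in turn. The function name and the toy scores are illustrative assumptions, not data from the study.

```python
from math import sqrt


def jackknife_se_of_mean(groups):
    """Delete-one-group jackknife standard error of the overall mean.

    groups -- list of lists; each inner list holds the scores of the
              students in one sampled cluster (for example, one school).
    """
    g = len(groups)
    total = sum(sum(scores) for scores in groups)
    count = sum(len(scores) for scores in groups)
    # Recompute the mean with each cluster left out in turn.
    replicates = [
        (total - sum(scores)) / (count - len(scores)) for scores in groups
    ]
    center = sum(replicates) / g
    variance = (g - 1) / g * sum((r - center) ** 2 for r in replicates)
    return sqrt(variance)


# Toy data: three clusters of two students each.
toy = [[1, 2], [3, 4], [5, 6]]
print(round(jackknife_se_of_mean(toy), 4))
```

Dropping whole clusters rather than individual students lets the estimate reflect the between-school variation that a clustered sample introduces, which a simple s/√n formula would understate.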






Table B.17 

Mean scores and means compared with United States: 

Second International Science Study, 14-year-olds (30 core items) 



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Hungary                       21.7    0.26      4.7        higher
Japan                         20.2    0.09      5.0        higher
Netherlands                   19.8    0.26      5.1        higher
Canada (English)              18.6    0.17      4.7        higher
Finland                       18.5    0.13      4.2        higher
Sweden                        18.4    0.22      4.9        higher
Korea                         18.1    0.15      4.6        higher
Poland                        18.1    0.22      5.2        higher
Norway                        17.9    0.16      4.7        higher
Australia                     17.8    0.19      4.9        higher
England                       16.7    0.22      4.9        same
Italy                         16.7    0.28      5.0        same
Singapore                     16.5    0.28      4.9        same
United States                 16.5    0.27      5.0        U.S.
Thailand                      16.5    0.22      4.1        same
Hong Kong                     16.4    0.25      4.5        same
Philippines                   11.5    0.20      4.6        lower

¹Standard error jackknifed, see IEA, 23.
²Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 32.






Table B.18 

Mean scores and means compared with United States: 

Second International Science Study, Last year of secondary school 

(30 core items — biology) 



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Singapore                     66.8    --        12.8       --
England                       63.4    0.24      13.1       higher
Hungary                       59.7    0.37      13.5       higher
Poland                        56.9    0.29      12.9       higher
Hong Kong (form 7)            55.8    0.28      16.8       higher
Norway                        54.8    0.33      15.0       higher
Finland                       51.9    0.16      12.8       higher
Hong Kong (form 6)            50.8    0.27      14.8       higher
Sweden                        48.5    0.23      15.8       higher
Australia                     48.2    0.15      13.9       higher
Canada (English)              45.9    0.20      14.0       higher
Japan                         46.2    0.48      15.1       higher
Italy                         42.3    1.05      14.1       higher
United States³                37.9    0.41      15.4       U.S.

--Not available.

¹Standard error jackknifed, see IEA, 23.
²Based on Bonferroni adjusted t-test for comparisons with the United States.
³United States tested 25 core items and Australia tested 29 items, as compared with 30 items in all other
countries.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 51.






Table B.19 

Mean scores and means compared with United States:

Second International Science Study, Last year of secondary school 

(30 core items — chemistry) 



                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Hong Kong (form 7)            77.0    0.43      17.4       higher
England                       69.5    0.29      17.2       higher
Singapore                     66.1    --        17.4       --
Hong Kong (form 6)            64.4    0.40      17.0       higher
Japan                         51.9    0.86      22.0       higher
Hungary                       47.7    0.70      18.3       higher
Australia                     46.6    0.31      18.8       higher
Poland                        44.6    0.44      17.1       higher
Norway                        41.9    0.37      16.8       higher
Sweden                        40.0    0.23      16.6       higher
Italy                         38.0    1.45      23.4       same
United States³                37.7    0.67      18.3       U.S.
Canada (English)              36.9    0.31      16.0       same
Finland                       33.3    0.26      13.7       lower

--Not available.

¹Standard error jackknifed, see IEA, 23.
²Based on Bonferroni adjusted t-test for comparisons with the United States.
³United States tested 25 core items, as compared with 30 items in other countries.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 52.






Table B.20 

Mean scores and means compared with United States: 

Second International Science Study, Last year of secondary school

(30 core items — physics) 







                              Number of items correct
Educational                           Standard             Mean
system and                            error of  Standard   compared
rank by mean                  Mean    mean¹     deviation  with U.S.²

Hong Kong (form 7)            69.9    0.38      14.4       higher
Hong Kong (form 6)            59.3    0.34      14.7       higher
England                       58.3    0.20      14.9       higher
Hungary                       56.5    0.50      17.2       higher
Japan                         56.1    0.58      17.2       higher
Singapore                     54.9    --        13.2       --
Norway                        52.8    0.33      15.6       higher
Poland                        51.5    --        17.2       --
Australia                     48.5    0.21      15.1       higher
United States³                45.5    0.53      15.8       U.S.
Sweden                        44.8    0.18      14.9       same
Canada (English)              39.6    0.20      14.6       lower
Finland                       37.9    0.27      13.8       lower
Italy                         28.0    0.25      12.9       lower

--Not available.

¹Standard error jackknifed, see IEA, 23.
²Based on Bonferroni adjusted t-test for comparisons with the United States.
³United States tested 26 core items, as against 30 items in other countries.

SOURCE: Data from International Association for the Evaluation of Educational Achievement, Student
Achievement in Seventeen Countries (Oxford: Pergamon Press, 1988), 53.




Table B.21 

Mean scores and means compared with United States: 
International Assessment of Educational Progress, 13-year-olds
(63 items — mathematics proficiency) 



                                  Proficiency score
Educational                                           Mean
system and                                  Standard  compared with
rank by mean                      Mean¹     error²    U.S.³

Korea                             567.8     2.7       higher
Canada (Quebec: French)           543.0     3.1       higher
Canada (British Columbia)         539.8     2.2       higher
Canada (Quebec: English)          535.8     2.0       higher
Canada (New Brunswick: English)   529.0     2.6       higher
Canada (Ontario: English)         516.1     3.1       higher
Canada (New Brunswick: French)    514.2     3.3       higher
Spain                             511.7     4.6       higher
United Kingdom                    509.9     3.5       higher
Ireland                           504.3     3.7       higher
Canada (Ontario: French)          481.5     2.7       same
United States                     473.9     4.5       U.S.

¹Score based on scale ranging from 0 to 1,000, with mean of 500 and standard deviation of 100.
²Standard error jackknifed.
³Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from Archie E. Lapointe, Nancy A. Mead, and Gary W. Phillips, A World of Differences:
International Assessment of Educational Progress (Princeton: Educational Testing Service, 1989), 14, 84.






Table B.22 

Mean scores and means compared with United States: 
International Assessment of Educational Progress, 13-year-olds
(60 items - science proficiency)



                                  Proficiency score
Educational                                           Mean
system and                                  Standard  compared with
rank by mean                      Mean¹     error²    U.S.³

Canada (British Columbia)         551.3     2.1       higher
Korea                             549.9     2.9       higher
United Kingdom                    519.5     3.7       higher
Canada (Quebec: English)          515.3     2.8       higher
Canada (Ontario: English)         514.7     2.7       higher
Canada (Quebec: French)           513.4     3.3       higher
Canada (New Brunswick: English)   510.5     2.7       higher
Spain                             503.9     4.3       higher
United States                     478.5     3.5       U.S.
Ireland                           469.3     3.5       same
Canada (Ontario: French)          468.3     2.2       same
Canada (New Brunswick: French)    468.1     3.9       same

¹Score based on scale ranging from 0 to 1,000, with a mean of 500 and standard deviation of 100.
²Standard error jackknifed.
³Based on Bonferroni adjusted t-test for comparisons with the United States.

SOURCE: Data from Archie E. Lapointe, Nancy A. Mead, and Gary W. Phillips, A World of Differences:
International Assessment of Educational Progress (Princeton: Educational Testing Service, 1989), 36, 84.






Appendix C 
Secondary School Retention Rates 






Secondary School Retention Rates 



The countries participating in the international achievement studies have distinct national
policies with regard to advancing students through the secondary educational system. The United
States and several other countries attempt to enroll, retain, and graduate as many secondary
school age students as possible. In this country, almost all secondary students attend
comprehensive high schools. Although some other countries have high enrollments of this age
group, students may attend any one of several types of learning institutions, only some of which
are designed around an academic curriculum. Still other countries significantly limit access to
academic secondary schooling programs. For purposes of the international assessments, groups
of students attending particular types of institutions may be excluded from the design sample,
and therefore countries may not be sampling comparable pools of students. The result is neither
a representative sample of the age cohort nor a representative sample of students in any kind of
school during the last year of secondary school.

With the exception of figure C.5, data in the following set of figures were drawn from the
international survey reports themselves. They are not consistent because retention estimates may
not have been calculated on the same basis from study to study or country to country. The most
recent data (figure C.5), from the Organization for Economic Cooperation and Development
(OECD), may more closely reflect current trends. Generally, however, the data on student
retention patterns must be viewed with considerable caution because the estimates from country
to country and study to study may not be predicated on the same sets of assumptions.






Figure C.1--Estimated percentage of age group¹ enrolled full time in the last
year of secondary school: First International Mathematics
Study, 1963-64

[Bar chart not reproduced. Horizontal axis: estimated percent enrolled full time, 0 to 100.]

¹The age at which students typically attain the last year of secondary school varies from country to country, ranging
from 17 to 20 years.

SOURCE: Data from Torsten Husén, International Study of Achievement in Mathematics, Vol. II (New York:
John Wiley & Sons, Inc., 1967), table 3.M.



Figure C.2--Estimated percentage of age group¹ enrolled in last year of
secondary school: First International Science Study, 1969

[Bar chart not reproduced. Horizontal axis: estimated percent enrolled.]

¹The age at which students typically attain the last year of secondary school varies from country to country, ranging
from 17 to 20 years.

²These figures are for the terminal grade of the secondary school [text illegible in source] both New
Zealand and Scotland it is possible to proceed to university [text illegible in source].

³This figure is for the Gymnasium only; there are many students of this age in other schools such as higher technical
schools.

SOURCE: Data from L.C. Comber and John P. Keeves, Science Education in Nineteen Countries (New York: John Wiley
& Sons, Inc., 1973), table 4.1.



Figure C.3--Estimated percentage of age group¹ enrolled in last year of
secondary school: Second International Mathematics Study,
1980-82

[Bar chart not reproduced. Horizontal axis: estimated percent enrolled. Educational systems legible in the chart labels:
Belgium (Flemish), Canada (British Columbia), Canada (Ontario), England², Finland, Hungary, Israel, New Zealand,
Sweden, United States.]

¹The age at which students typically attain the last year of secondary school varies from country to country, ranging
from 17 to 20 years.

²Includes data from Wales.

SOURCE: Data from Kenneth Travers, "The Second International Mathematics Study: Overview of Major Findings"
(unpublished paper, Champaign-Urbana: University of Illinois, 1986), table 4.2.1.



Figure C.4--Estimated percentage of age group¹ in last year of secondary
school: Second International Science Study, 1983-86

[Bar chart not reproduced. Horizontal axis: estimated percent enrolled, 10 to 100.]

¹The age at which students typically attain the last year of secondary school varies from country to country, ranging
from 17 to 20 years.

²Data for English-speaking students only.

³Includes an estimated 18 percent of age group attending vocational schools.
⁴Includes an estimated 22 percent of age group attending vocational schools.
⁵Excludes vocational school enrollments.

⁶In Sweden, 90 percent of the age group are enrolled in upper secondary education (grades 10-12), and 80 percent
complete secondary education (grade 11). Thirteen percent of the age group are enrolled in science tracks and 15
percent in non-science tracks in Grade 12. The remainder take a 2-year vocational or general track and leave school after
Grade 11.

NOTES: No data available for Korea. Data in this figure may differ from table D.10 due to the preliminary nature of the
publication (see source below) from which these estimates are drawn.

SOURCE: Data from International Association for the Evaluation of Educational Achievement (IEA), Science Achievement
in Seventeen Countries: A Preliminary Report (Oxford: Pergamon Press, 1988), table IB.



Figure C.5--Estimated percentage of 17-year-olds enrolled in school full time
or part time at the secondary level: 1987-88

[Bar chart not reproduced. Horizontal axis: estimated percent enrolled full time or part time, 0 to 100.]

¹1986-87 data.

²Does not include 17-year-olds who happen to be enrolled in postsecondary studies.

SOURCE: Data from Organization for Economic Cooperation and Development (OECD), Education in OECD Countries:
1987-88, A Compendium (Paris: OECD, 1990), table 4.2.



Appendix D 



Age Distributions and
Related Characteristics of Test Takers



Table D.1--Mean age and standard deviation of the age distribution of
13-year-old sample: First International Mathematics Study

Country                           Mean age¹         Standard deviation²

Australia                         13:03              7.7
Belgium                           14:00              8.8
England                           14:04              4.2
Finland                           13:11              7.3
France                            13:07              7.8
Federal Republic of Germany       13:[illegible]     6.6
Netherlands                       13:01             11.6
Israel                            13:11              5.6
Japan                             13:05              3.4
Scotland                          14:00              5.4
Sweden                            13:08              4.9
United States                     14:00              6.8

¹Mean age in years and months.
²Standard deviation in months.

SOURCE: Data from T. Husén, International Study of Achievement in Mathematics: A Comparison of
Twelve Countries, Vol. I (New York: John Wiley & Sons, Inc., 1967), 270-73.
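
Because mean ages in these tables are written as years:months while standard deviations are in months, comparing systems is easier once both are in the same unit. The sketch below is only an illustration of that conversion; the helper names `parse_age`, `age_in_months`, and `age_gap_in_months` are hypothetical and do not come from the source studies.

```python
def parse_age(text):
    """Split a 'years:months' string such as '14:04' into (years, months)."""
    years_part, months_part = text.split(":")
    years, months = int(years_part), int(months_part)
    if not 0 <= months <= 11:
        raise ValueError(f"months out of range in {text!r}")
    return years, months


def age_in_months(text):
    """Total age in months for a 'years:months' string."""
    years, months = parse_age(text)
    return 12 * years + months


def age_gap_in_months(age_a, age_b):
    """Difference between two mean ages, in months (positive if age_a is older)."""
    return age_in_months(age_a) - age_in_months(age_b)


if __name__ == "__main__":
    # England (14:04) and the Netherlands (13:01), as listed in Table D.1.
    gap = age_gap_in_months("14:04", "13:01")
    print(f"England's sample is {gap} months older on average.")
```

With the Table D.1 entries for England and the Netherlands, the gap works out to 15 months, which is larger than either system's standard deviation of age and illustrates why the samples are not strictly comparable by age.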






Table D.2--Mean age and standard deviation of the age distribution of
last-year secondary sample (mathematics students):
First International Mathematics Study

Country                           Mean age¹         Standard deviation²

Australia                         [illegible]        9.2
Belgium                           18:01             11.6
England                           17:11              7.5
Finland                           19:01             10.6
France                            18:07             13.7
Federal Republic of Germany       19:10              8.4
Netherlands                       18:02             11.7
Israel                            [illegible]        8.5
Japan                             17:08              3.6
Scotland                          17:06              8.0
Sweden                            19:07             10.9
United States                     17:09              6.3

¹Mean age in years and months.
²Standard deviation in months.

SOURCE: Data from T. Husén, International Study of Achievement in Mathematics: A Comparison of
Twelve Countries, Vol. I (New York: John Wiley & Sons, Inc., 1967), 270-73.



Table D.3--Mean age and standard deviation of the age distribution of
last-year secondary sample (non-mathematics students):
First International Mathematics Study

Country                           Mean age¹         Standard deviation²

Belgium                           18:00             11.2
England                           17:11              6.8
Finland                           19:02             10.8
France                            18:09             12.8
Federal Republic of Germany       19:[illegible]     8.8
Netherlands                       18:07             11.3
Japan                             17:08              3.7
Scotland                          [illegible]        6.2
Sweden                            19:07             11.3
United States                     17:10              7.3

¹Mean age in years and months.
²Standard deviation in months.

SOURCE: Data from T. Husén, International Study of Achievement in Mathematics: A Comparison of
Twelve Countries, Vol. I (New York: John Wiley & Sons, Inc., 1967), 270-73.



Table D.4--Percentage of 10-year-old sample in different grades:
First International Science Study

                      Percent of   Percent of   Percent of   Percent of
           Mean       sample in    sample in    sample in    sample in
Country    age¹       grade 3      grade 4      grade 5      grade 6

Belgium (Flemish)
Belgium (French)
Chile
England
Federal Republic of Germany²
Finland
Hungary
India
Italy
Japan
Netherlands
Scotland
Sweden
Thailand
United States²



11:00 


0 


1 


55 


1 1 
11 


10:06 


0 






A 


10K15 


15 


38 


41 


6 


10K)6 


0 


0 


48 


52 


10:05 


8 


44 


51 


1 


10K)6 


23 


77 


0 


0 


10:07 


0 


71 


29 


0 


10:07 


19 


30 


30 


13 


10:03 


3 


78 


19 


0 


10:07 


0 


4 


96 


0 


10:05 


0 


0 


100 


0 


10:06 


5 


37 


58 


0 


10:06 


0 


1 


35 


63 


10:05 


47 


53 


0 


0 


10:07 


2 


52 


42 


3 


10K)7 


2 


37 


66 


0 



¹Mean age in years and months.

²Data provided sums to over 100 percent.

NOTE: May not sum to 100 percent because some sampled students fall outside grades reported
in table.

SOURCE: Data from L.C. Comber and John P. Keeves, Science Education in Nineteen Countries (New York:
John Wiley & Sons, 1973), 48.



Table D.5--Percentage of 14-year-old sample in different grades:
First International Science Study

                                               Percent of   Percent of   Percent of   Percent of   Percent of
                                 Mean          sample in    sample in    sample in    sample in    sample in
Country                          age¹          grade 6      grade 7      grade 8      grade 9      grade 10

Australia                        14:05          0
Belgium (Flemish)                15:00          0
Belgium (French)                 14:07          0
Chile                            14:06         20
England                          14:07          0
Federal Republic of Germany      14:05          2
Finland                          14:06          4
Hungary                          14:05          0
India                            [illegible]   14
Iran                             [illegible]    0
Italy                            14:08          0
Japan                            14:05          0
Netherlands                      14:06          0
New Zealand                      14:06          0
Scotland                         14:07          0
Sweden                           14:06          0
Thailand                         14:06          1
United States                    14:07          0

¹Mean age in years and months.
NOTE: May not sum to 100 because some sampled students may fall outside grades reported in table.



n 




41 


52 


n 

V 


7 
/ 


OO 


3 


n 

V 


33 


05 


0 


14 


28 




A 


0 


0 


48 


52 


9 


44 


45 


0 


36 


60 


0 


0 


0 


77 


23 


0 


28 


37 


13 


0 


4 


93 


3 


0 


0 


45 


53 


2 


0 


0 


100 


0 


18 


71 


10 


0 


0 


0 


26 


72 


0 


1 


45 


54 


49 


51 


0 


0 


4 


33 


55 


6 


2 


26 


72 


0 






Table D.6--Mean age of 13-year-old sample¹:
Second International Mathematics Study

Country                           Mean age²         Standard deviation³

Belgium (Flemish)
Belgium (French)
Canada (British Columbia)
Canada (Ontario)
England and Wales
Finland
France
Hong Kong
Hungary
Israel
Japan
Luxembourg
Netherlands
Nigeria
New Zealand
Scotland
Swaziland
Sweden
Thailand
United States

¹Students in grade where majority has attained age 13:00 to 13:11 years by the middle of the school year.
²Mean age in years and months.
³Standard deviation in months.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Educational Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 64.



\A4Yy 


s 




11 




6 


l*f«Ul 


7 




4 








s 


13K)2 


11 


14K)2 


13 


14:00 


5 


n.'OS 


4 


14K)5 


9 


14.-04 


8 


16:07 


38 


14:00 


5 


14.-00 


4 


15K)7 


23 


13K)9 


4 


14:02 


9 


14:01 


6 







Table D.7--Mean age of last-year secondary sample:
Second International Mathematics Study

Country                           Mean age¹         Standard deviation²

Belgium (Flemish)                 18:01             10
Belgium (French)                  18:04             11
Canada (British Columbia)         17:09              6
Canada (Ontario)                  18:05             14
England and Wales                 18:01              4
Finland                           18:06              6
Hong Kong                         18:05             12
Hungary                           18:01              4
Israel                            17:09              5
Japan                             18:01              4
New Zealand                       [illegible]        6
Scotland                          [illegible]        7
Sweden                            19:02              9
Thailand                          18:02              9
United States                     17:08              7

¹Mean age in years and months.
²Standard deviation in months.

SOURCE: Data from David Robitaille and Robert Garden, eds., The International Association for the
Evaluation of Educational Achievement (IEA) Study of Mathematics II: Contexts and Outcomes of School
Mathematics, Vol. II (Oxford: Pergamon Press, 1989), 64.



Table D.8--Mean age of 10-year-old sample and selected
schooling characteristics: Second International Science Study

                     Age entering     Grade tested    Percent in    Mean             Standard
Country              formal school    in study        school        age¹             deviation²

Australia            6                4,5,6           99            10:06             3.3
Canada (English)     6                5               99            11:01             7.1
England              5                5               99            10:[illegible]    3.6
Finland              7                4               99            10:10             4.1
Hong Kong            6                4               99            10:05             9.8
Hungary              6                4               99            10:03             5.2
Italy                6                5               99            10:09             5.2
Japan                6                5               99            10:07             3.5
Korea                6                5               99            11:02             7.4
Norway               7                4               99            10:11             4.0
Philippines          7                5               97            11:01            11.3
Poland               7                4               99            10:11             5.4
Singapore            6                5               99            10:10             4.9
Sweden (A)           7                3               99             9:10             3.7
Sweden (B)           7                4               99            10:10             4.1
United States        6                5               99            11:03             6.9

¹Mean age in years and months.
²Standard deviation in months.

SOURCE: Data from T. N. Postlethwaite, Second International Science Study, Vol. II Draft (Hamburg, July
1990), 6, 7.



Table D.9--Mean age of 14-year-old sample and selected
schooling characteristics: Second International Science Study

                     Grade       Percent in    Mean             Standard
Country              tested      school        age¹             deviation²

Australia            8,9,10      98            14:05             3.3
Canada (English)     9           99            15:00             6.1
England              9           98            14:02             3.6
Finland              8           99            14:10             4.1
Hong Kong            8           99            14:07            10.9
Hungary              8           92            14:[illegible]    4.7
Italy (A)            8           99            13:11             8.6
Italy (B)            9           72            14:08             3.2
Japan                9           99            14:07             3.5
Korea                9           99            15:00             7.2
Netherlands          9           99            15:06            12.5
Norway               9           99            15:10             4.0
Philippines          9           60            16:01            18.9
Poland               8           91            15:00             5.8
Singapore            9           91            15:[illegible]    9.2
Sweden (A)           7           99            13:10             4.8
Sweden (B)           8           99            14:10             3.8
Thailand             9           32            15:04             8.9
United States        9           99            15:03             9.1

¹Mean age in years and months.
²Standard deviation in months.

SOURCE: Data from T. N. Postlethwaite, Second International Science Study, Vol. II Draft (Hamburg, July
1990), 6, 7.



Table D.10--Mean age of last-year secondary sample and selected
schooling characteristics: Second International Science Study

                     Grade       Percent in    Mean       Standard
Country              tested      school        age¹       deviation²


Ausnaoa 




lO 


17a;3 


11 


\ 4ililmin ^migllSlly 




Do 


loHI3 


11 


JBn&lltlnJl 








7 




12 


41^ 


1 n _r\^ 


7 


iiong Jvong \XOfin 


io 
IZ 


£l 


18:03 


13 


ritMig AOug vTonn / ^ 


11 
13 


iXj 


ly.*iu 


11 


Hungaiy 


12 


183 


ISKX) 


4 


Italy 


12,13 


34 


19*00 


13 


Japan 


12 


633 


18:02 


4 


Korea 


12 


383 


17:09 


8 


Norway 


12 


40 


18K)9 


7 


Pdand 


12 


28 


18K)6 


5 


Singapore 


12,13 


17 


18:01 


8 


Sweden 


12 


284 


19:00 


11 


United States 


12 


83 


17:07 


9 



¹Mean age in years and months.
²Standard deviation in months.

³Certain countries excluded vocational students from calculation of students in school:

Finland--63 percent inclusive of vocational, 41 percent exclusive.

Hungary--40 percent inclusive of vocational, 18 percent exclusive. In Hungary, 18 percent of the
age group are in [illegible] secondary school studying science. Forty percent are [illegible] in school.

Japan--99 percent inclusive of vocational, 63 percent exclusive.

Korea--83 percent inclusive of vocational, 38 percent exclusive.
⁴In Sweden, 90 percent of the age group are enrolled in upper secondary education (grades 10-12), and 80
percent complete upper secondary education (grade 11). Thirteen percent are enrolled in science tracks, and
15 percent in non-science tracks in grade 12; hence, the Second Science calculation of 28 percent.

NOTE: Data in this table do not precisely coincide with data in table C.4. Table C.4 is based on
information from the preliminary report, issued in 1988.

SOURCE: Data from T. N. Postlethwaite, Second International Science Study, Vol. II Draft (Hamburg,
July 1990), 6, 7.



Table D.11--Mean age of last-year secondary sample (biology)
and percentage of students in school taking biology:
Second International Science Study

                         Percent of those in school
Country                  taking biology                 Mean age¹

Australia
Canada (English)
Canada (French)
England
Finland
Hong Kong (form 6)
Hong Kong (form 7)
Hungary
Israel
Italy
Japan
Korea
Norway
Poland
Singapore
Sweden
Thailand
United States

¹Mean age in years and months.

SOURCE: Data from T. N. Postlethwaite, Second International Science Study, Vol. II Draft (Hamburg,
July 1990), 6, 7.



18 




28 




7 

t 


I /ZUZ 


4 




41 


18K)7 


12 


18K)5 


7 


19.-02 


3 


ism 


20 


17.-07 


4 


19:05 


12 


18K)1 


38 


17:11 


4 


18:11 


9 


18.-07 


3 


ism 


5 


18:11 


7 


18:03 


12 


17:05 



10 



Table D.12--Mean age of last-year secondary sample (chemistry)
and percentage of students in school taking chemistry:
Second International Science Study

                         Percent of those in school
Country                  taking chemistry               Mean age¹

Australia                12                             17:03
Canada (English)         25                             18:04
Canada (French)          37                             17:01
England                   5                             18:00
Finland                  16                             18:06
Hong Kong (form 6)       20                             18:04
Hong Kong (form 7)       12                             19:03
Hungary                   1                             18:01
Israel                    8                             17:07
Italy                     1                             19:02
Japan                    16                             18:02
Korea                    37                             17:10
Norway                    6                             18:11
Poland                    9                             18:07
Singapore                 5                             18:00
Sweden                    6²                            19:00
Thailand                  7                             18:03
United States             2                             [illegible]

¹Mean age in years and months.

²In Sweden, although only 6 percent of the age group studies chemistry, 13 percent were tested.

SOURCE: Data from T. N. Postlethwaite, Second International Science Study, Vol. II Draft (Hamburg,
July 1990), 6, 7.



Table D.13--Mean age of last-year secondary sample (physics)
and percentage of students in school taking physics:
Second International Science Study

                         Percent of those in school
Country                  taking physics                 Mean age¹

Australia                [illegible]                    17:03
Canada (English)         15                             18:04
Canada (French)          35                             17:01
England                   6                             18:00
Finland                  14                             18:07
Hong Kong (form 6)       20                             18:04
Hong Kong (form 7)       12                             19:03
Hungary                   4                             18:00
Israel                   12                             17:07
Italy                    13                             19:02
Japan                    11                             18:02
Korea                    14                             17:11
Norway                   10                             18:11
Poland                    9                             18:07
Singapore                 7                             18:00
Sweden                   13                             19:00
Thailand                  7                             18:02
United States             1                             17:10

¹Mean age in years and months.

SOURCE: Data from T. N. Postlethwaite, Second International Science Study, Vol. II Draft (Hamburg,
July 1990), 6, 7.



Appendix E 



Mean Scores and Confidence Intervals 
for Participating Educational Systems 






Figure E.1

Mean scores and confidence intervals for participating educational systems:
First International Mathematics Study, 13-year-olds (70 items)

[Chart not reproduced. Horizontal axis: mean number correct, 10 to 40. Confidence intervals legible in the chart labels:]

Federal Republic of Germany: 24.3-26.[illegible]
England: 22.7-24.9
Scotland: 21.0-23.[illegible]
Netherlands: 20.2-22.6
France: 19.6-22.4
Australia: 18.2-19.6
United States: [illegible]
Sweden: 14.3-16.3

NOTE: Mean scores are denoted by the vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 11 comparisons with the United States.

SOURCE: See Appendix B.



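Figure E.2

Mean scores and confidence intervals for participating educational systems:
First International Mathematics Study, last year of secondary school
(mathematics students; number of items illegible)

[Chart not reproduced. Horizontal axis: mean number correct, 10 to 40. Confidence intervals legible in the chart labels:]

Israel: 33.8-39.1
[label illegible]: 32.9-36.3
France: 31.[illegible]-35.0
Netherlands: 30.7-33.1
Federal Republic of Germany: 27.8-29.8
Sweden: 26.0-28.6
Scotland: 24.6-26.[illegible]
Finland: 24.0-26.6
Australia: 20.4-22.9
United States: 12.8-14.[illegible]

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 11 comparisons with the United States.

SOURCE: See Appendix B.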



Figure E.3

Mean scores and confidence intervals for participating educational systems:
First International Mathematics Study, last year of secondary school

[Chart not reproduced. Horizontal axis: mean number correct, 10 to 40. Most chart labels are illegible in the source;
those that survive are:]

Federal Republic of Germany: [illegible]
England: 20.[illegible]-22.[illegible]
United States: 7.6-9.0
[label illegible]: 27.1-28.3

NOTE: On account of missing data confidence intervals could not be [text illegible in source].
NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 7 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.4

Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, 13-year-olds (eighth grade)
(46 core items--arithmetic)

[Chart not reproduced. Horizontal axis: mean percent correct, 15 to 85. Confidence intervals legible in the chart labels:]

Netherlands: 57.1-61.[illegible]
Belgium (Flemish): 55.7-59.7
France: 55.3-60.7
Hungary: 53.9-59.7
Belgium (French): 53.3-60.5
Hong Kong: 54.1-56.1
Canada (Ontario): 52.3-56.7
United States: 49.1-53.8
Scotland: 49.2-51.2
Israel: 47.0-52.8
England and Wales: 46.4-50.0
New Zealand: 43.3-47.3
Finland: 43.0-48.0
Luxembourg: 44.6-46.2
Thailand: 40.6-45.7
Sweden: 38.8-42.4
Swaziland: 29.6-35.0

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 19 comparisons with the United States.

SOURCE: See Appendix B.
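
The intervals printed in these figures can be related back to a mean and a standard error. The sketch below assumes that a "simple 95 percent confidence interval" means the mean plus or minus about 1.96 standard errors; the figure notes do not spell this out, so treat that as an assumption. The function names are hypothetical, and the sample intervals are ones printed in the arithmetic figure for the Second International Mathematics Study.

```python
from statistics import NormalDist

# Two-sided 95 percent multiplier under a normal approximation (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def interval_from(mean, se):
    """Simple 95 percent interval: mean plus or minus Z95 standard errors."""
    return (mean - Z95 * se, mean + Z95 * se)


def mean_and_se_from(lower, upper):
    """Recover the midpoint and implied standard error from a printed interval."""
    mean = (lower + upper) / 2
    se = (upper - lower) / (2 * Z95)
    return mean, se


def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    hong_kong = (54.1, 56.1)
    united_states = (49.1, 53.8)
    sweden = (38.8, 42.4)
    scotland = (49.2, 51.2)

    mean, se = mean_and_se_from(*hong_kong)
    print(f"Hong Kong: mean about {mean:.1f}, implied SE about {se:.2f}")
    print("U.S. overlaps Scotland:", intervals_overlap(united_states, scotland))
    print("U.S. overlaps Sweden:", intervals_overlap(united_states, sweden))
```

Overlap of two simple intervals is only a rough visual guide. The "higher", "same", and "lower" groupings in the figures rest on the Bonferroni-adjusted test, which uses a wider margin than a single 95 percent interval, so the two readings can disagree for systems near the boundary.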






Figure E.5

Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, 13-year-olds (eighth grade)
(30 core items--algebra)

[Chart not reproduced. Horizontal axis: mean percent correct, 15 to 85. Confidence intervals legible in the chart labels:]

Belgium (French): 45.[illegible]-53.0
Finland: 41.1-46.2
Hong Kong: 41.6-44.8
Scotland: 41.5-44.3
United States: [illegible]
Canada (Ontario): [illegible]
England and Wales: [illegible]
New Zealand: 37.2-41.6
Thailand: 35.7-39.7
Nigeria: 29.1-35.7
Sweden: 30.7-33.9
Swaziland: 22.[illegible]-28.0

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 19 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.6

Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, 13-year-olds (eighth grade)
(39 items--geometry)

[Chart not reproduced. Horizontal axis: mean percent correct, 15 to 85. Confidence intervals legible in the chart labels:]

Canada (Ontario): 41.8-44.6
Belgium (French): 39.9-45.7
Canada (British Columbia): 40.0-44.7
Sweden: 37.8-41.0
[label illegible]: 36.4-39.6
United States: 36.0-39.6
Israel: 33.2-38.6
Swaziland: 28.6-33.6
Nigeria: 24.6-27.8

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 19 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.7

Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, 13-year-olds (eighth grade)
(24 core items--measurement)

[Chart not reproduced. Horizontal axis: mean percent correct, 15 to 85. Confidence intervals legible in the chart labels:]

France: 57.7-[illegible]
Hong Kong: [illegible]-53.4
Canada (British Columbia): 49.4-54.[illegible]
Canada (Ontario): 49.0-52.6
Luxembourg: 49.3-50.9
Sweden: 46.7-50.7
England and Wales: 46.8-50.4
Scotland: 47.0-49.[illegible]
Thailand: 46.1-50.5
Israel: 44.1-48.8
New Zealand: 42.9-47.3
United States: 39.0-42.6
[label illegible]: 32.7-37.8
Nigeria: [illegible]

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 19 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.8

Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, 13-year-olds (eighth grade)
(18 core items--descriptive statistics)

[Chart not reproduced. Horizontal axis: mean percent correct, 15 to 85. Confidence intervals legible in the chart labels:]

[label illegible]: 68.0-73.8
Canada (British Columbia): 58.8-63.5
Hungary: 57.7-63.1
England and Wales: 58.4-61.0
Scotland: 58.3-60.3
Belgium (Flemish): 55.3-61.1
United States: 55.5-59.9
Finland: 55.4-59.8
France: 55.4-59.4
New Zealand: 55.1-59.[illegible]
Canada (Ontario): 55.0-59.0
Sweden: 54.1-58.[illegible]
Hong Kong: 54.7-57.[illegible]
Belgium (French): 48.7-55.3
Israel: 49.4-54.[illegible]
Thailand: 43.3-47.3
Luxembourg: 36.5-38.5
Nigeria: 34.[illegible]-39.6
Swaziland: 32.7-39.3

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 19 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.9

Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, last year of secondary school
(17 items--number systems)

[Chart not reproduced. Horizontal axis: mean percent correct, 15 to 85. Confidence intervals legible in the chart labels:]

Hong Kong: 75.1-80.9
Sweden: 60.4-63.6
Finland: 54.8-59.2
New Zealand: 48.[illegible]
Belgium: [illegible]
Canada (Ontario): 45.2-48.8
Israel: 43.2-48.9
Belgium (French): 41.1-46.9
Canada (British Columbia): 40.5-45.6
United States: 37.8-42.[illegible]
Scotland: 36.8-41.2
Thailand: 30.7-35.4
Hungary: 25.5-30.6

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 14 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.10

[Caption illegible in the source. Chart not reproduced. Horizontal axis: mean percent correct, 15 to 85. Confidence
intervals legible in the chart labels:]

New Zealand: 54.7-59.4
Canada (Ontario): [illegible]
Belgium (French): 51.[illegible]-58.1
Scotland: 46.2-49.[illegible]
Canada (British Columbia): 44.[illegible]-49.7
Hungary: 41.7-48.[illegible]
United States: 40.7-45.4
Thailand: 35.3-40.7

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 14 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.11

Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, last year of secondary school
(26 items--geometry)

[Chart not reproduced. Horizontal axis: mean percent correct. Confidence intervals legible in the chart labels:]

Finland: 46.[illegible]-[illegible]
New Zealand: 41.0-45.0
Belgium (Flemish): 39.6-44.6
Belgium (French): 35.5-40.6
Israel: 32.[illegible]-37.9
United States: 29.0-33.0
Hungary: 27.8-32.2
Canada (British Columbia): 27.9-31.4
Thailand: 26.2-29.8

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 14 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.12

Mean scores and confidence intervals for participating educational systems:
Second International Mathematics Study, last year of secondary school
(46 [remainder of item description illegible])

[Chart not reproduced. Horizontal axis: mean percent correct, 15 to 85. Confidence intervals legible in the chart labels:]

[label illegible]: 63.1-68.9
Finland: 52.[illegible]-57.2
Sweden: 49.4-52.[illegible]
New Zealand: 45.8-50.[illegible]
Canada (Ontario): 44.0-48.0
Belgium (Flemish): 43.8-48.2
Israel: 41.9-48.1
Belgium (French): 40.[illegible]-45.7
Scotland: 30.2-33.8
United States: 26.7-31.4
Thailand: 24.4-27.6
Hungary: 23.8-28.2
Canada (British Columbia): 19.0-23.0

NOTE: Mean scores are denoted by the bold vertical bar. The simple 95 percent confidence interval for the
mean is denoted by the horizontal bar. Statistical significance of comparisons to the United States is based on a
Bonferroni-adjusted t-test for 14 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.13

Mean scores and confidence intervals for participating educational systems:
First International Science Study, 10-year-olds (40 core items)

[Interval plot; horizontal axis: Mean Number Correct, 10 to 40; a band is labeled "Same as U.S." Intervals legible in the source:]

Sweden: 17.[illegible]-19.1
Belgium (Flemish): 16.6-19.[illegible]
United States: 17.1-18.3
Finland: 16.4-18.6
Hungary: 16.2-17.2
Italy: 15.9-17.1
England: 15.0-16.4

NOTE: Mean scores are denoted by the bold vertical bar; the simple 95 percent confidence interval around each mean is denoted by the horizontal bar. Statistical significance of comparisons determined using a Bonferroni-adjusted t-test for 11 comparisons with the United States.

SOURCE: See Appendix B.






Figure E.14

Mean scores and confidence intervals for participating educational systems:
First International Science Study, 14-year-olds (80 core items)

[Interval plot; horizontal axis: Mean Number Correct, 5 to 40; a band is labeled "Same as U.S." Intervals legible in the source:]

[name illegible]: 29.6-32.8
Hungary: 28.4-29.[illegible]
Australia: 23.[illegible]-25.[illegible]
New Zealand: 22.8-25.6
Federal Republic of Germany: 22.6-24.8
Sweden: 20.6-22.[illegible]
United States: 20.9-22.3
Scotland: 19.9-22.9
England: 20.1-22.[illegible]
Belgium (Flemish): 19.6-22.8
Finland: 19.[illegible]-21.5
Italy: 18.0-19.1
Netherlands: 16.5-19.1
Belgium (French): 13.[illegible]-17.[illegible]

NOTE: Mean scores are denoted by the bold vertical bar; the simple 95 percent confidence interval around each mean is denoted by the horizontal bar. Statistical significance of comparisons determined using a Bonferroni-adjusted t-test for 13 comparisons with the United States.

SOURCE: See Appendix B.



Figure E.15

Mean scores and confidence intervals for participating educational systems:
First International Science Study, last year of secondary school

[Interval plot; horizontal axis: Mean Number Correct, 10 to 40; a band is labeled "Same as U.S." Intervals legible in the source:]

[name illegible]: 26.[illegible]-27.8
Scotland: 21[remainder illegible]
England: 21.0-24.2
Sweden: 18.3-20.1
Belgium (Flemish): 15.6-17.6
Italy: 15.6-16.2
Belgium (French): 14.2-16.4
United States: 11.7-15.[illegible]

NOTE: The mean score is denoted by the bold vertical bar; the simple confidence interval around each mean is denoted by the horizontal bar. Intervals are 95 percent confidence intervals. Statistical significance of comparisons with the United States determined using a Bonferroni-adjusted t-test for 15 comparisons with the United States.

SOURCE: See Appendix B.






Figure E.16

Mean scores and confidence intervals for participating educational systems:
Second International Science Study, 10-year-olds (24 core items)

[Interval plot; horizontal axis: Mean Number Correct; a band is labeled "Same as U.S." Intervals legible in the source:]

Hungary: 14.0-14.9
Canada (English): 13.[illegible]-14.0
Italy: 12.9-13.9
United States: 12.9-13.6
Australia: 12.6-13.2
Norway: 12.1-13.3

NOTE: Mean scores are denoted by the bold vertical bar; the simple 95 percent confidence interval around each mean is denoted by the horizontal bar. Statistical significance of comparisons determined using a Bonferroni-adjusted t-test for 14 comparisons with the United States.

SOURCE: See Appendix B.






Figure E.17

Mean scores and confidence intervals for participating educational systems:
Second International Science Study, 14-year-olds (30 core items)

[Interval plot; horizontal axis: Mean Number Correct, 10 to 25; a band is labeled "Same as U.S." Intervals legible in the source:]

[name illegible]: 21.2-[illegible]
[name illegible]: 2[illegible]-20.4
Netherlands: 19.3-20.3
Sweden: 18.6-18.8
Korea: 17.[illegible]-18.3
[name illegible]: 17.7-18.5
England: 16.3-17.1
Italy: 16.2-17.2
Singapore: 16.[illegible]-17.1
United States: 16.0-17.0
Thailand: 16.1-16.9
Hong Kong: 15.9-16.9
Philippines: 11.1-11.5

NOTE: Mean scores are denoted by the bold vertical bar; the simple 95 percent confidence interval around each mean is denoted by the horizontal bar. Statistical significance of comparisons determined using a Bonferroni-adjusted t-test for 16 comparisons with the United States.

SOURCE: See Appendix B.






Figure E.18

Mean scores and confidence intervals for participating educational systems:
Second International Science Study, last year of secondary school
(30 core items: biology)

[Interval plot; horizontal axis ticks from 15 to 65. Intervals legible in the source:]

England: 62.9-63.9
[name illegible]: 59.0-[illegible]
Hong Kong (form 7): 55.[illegible]-56.[illegible]
Sweden: 48.[illegible]-49.[illegible]
Australia: 47.[illegible]-48.5
Italy: 40.3-44.[illegible]
United States*: 37.6-38.6

*United States test on 23 items; Australia 29 items; all other countries 30 items.

NOTE: On account of missing data, confidence intervals could not be calculated for [remainder illegible].

NOTE: Mean scores are denoted by the bold vertical bar; the simple 95 percent confidence interval around each mean is denoted by the horizontal bar. Statistical significance of comparisons determined using a Bonferroni-adjusted t-test for 12 comparisons with the United States.

SOURCE: See Appendix B.






Figure E.19

Mean scores and confidence intervals for participating educational systems:
Second International Science Study, last year of secondary school
(30 core items: chemistry)

[Interval plot; horizontal axis: Mean Percent Correct, 15 to 85; a band is labeled "Same as U.S." Intervals legible in the source:]

[name illegible]: 76.[illegible]-77.8
[name illegible]: 68.9-70.[illegible]
Hong Kong (form 6): [interval illegible]
[name illegible]: 50.2-53.6
Australia: 46.[illegible]
Norway: 41.2-42.6
Sweden: 39.5-40.[illegible]
Italy: 35.2-40.8
United States*: 36.4-39.0
Canada (English): 36.[illegible]-37.5
Finland: 32.8-33.8

*United States test on 25 items; all other countries 30 items.

NOTE: On account of missing data, confidence intervals could not be calculated for [remainder illegible].

NOTE: Mean scores are denoted by the bold vertical bar; the simple 95 percent confidence interval around each mean is denoted by the horizontal bar. Statistical significance of comparisons determined using a Bonferroni-adjusted t-test for 12 comparisons with the United States.

SOURCE: See Appendix B.






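Figure E.20

Mean scores and confidence intervals for participating educational systems:
Second International Science Study, last year of secondary school
(30 core items: physics)

[Interval plot; horizontal axis: Mean Percent Correct, 15 to 85; a band is labeled "Same as U.S." Intervals legible in the source:]

[name illegible] (form 7): 68.6-7[illegible]
Australia: [interval illegible]
United States*: 44.5-[illegible]
Sweden: 44.4-45.2
Finland: 37.4-38.4
Italy: 27.[illegible]-28.[illegible]

*United States test on 25 items; Canada (English) 26 items; [remainder illegible].

NOTE: On account of missing data, confidence intervals could not be calculated for [remainder illegible].

NOTE: Mean scores are denoted by the bold vertical bar; the simple 95 percent confidence interval around each mean is denoted by the horizontal bar. Statistical significance of comparisons determined using a Bonferroni-adjusted t-test for 12 comparisons with the United States.

SOURCE: See Appendix B.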






Figure E.21

Mean scores and confidence intervals for participating educational systems:
International Assessment of Educational Progress, 13-year-olds
(63 items: mathematics proficiency)

[Interval plot; horizontal axis: Mean Proficiency Score, 200 to 600; a band is labeled "Same as U.S." Intervals legible in the source:]

Korea: 56[illegible]-573.1
[name illegible]: 536.9-549.1
Canada (British Columbia): [interval illegible]
Canada (Quebec: English): 531.9-539.7
Canada (New Brunswick: English): 523.9-534.1
Canada (Ontario: English): 510.0-522.2
Canada (New Brunswick: French): 507.7-52[illegible]
Spain: 502.7-52[illegible]
United Kingdom: 503.0-516.8
[name illegible]: 497.1-511.5
Canada (Ontario: French): 476.2-486.8
United States: 465.1-482.7

NOTE: Mean scores are denoted by the bold vertical bar; the simple 95 percent confidence interval around each mean is denoted by the horizontal bar. Statistical significance of comparisons determined using a Bonferroni-adjusted t-test for 11 comparisons with the United States.

SOURCE: See Appendix B.






Figure E.22

Mean scores and confidence intervals for participating educational systems:
International Assessment of Educational Progress, 13-year-olds
(60 items: science proficiency)

[Interval plot; horizontal axis: Mean Proficiency Score, 200 to 600; a band is labeled "Same as U.S." Intervals legible in the source:]

Canada (British Columbia): 547.2-555.4
Korea: 544.[illegible]-555.6
United Kingdom: 512.3-526.8
Canada (Quebec: English): 509.8-520.8
Canada (Ontario: English): 509.4-520.0
Canada (Quebec: French): 506.9-519.9
Canada (New Brunswick: English): 505.2-515.8
Spain: 495.[illegible]-512.[illegible]
United States: 469.1-487.9
Ireland: 462.4-476.2
Canada (Ontario: French): 464.0-472.6
Canada (New Brunswick: French): 460.[illegible]-475.7

NOTE: Mean scores are denoted by the bold vertical bar; the simple 95 percent confidence interval around each mean is denoted by the horizontal bar. Statistical significance of comparisons determined using a Bonferroni-adjusted t-test for 11 comparisons with the United States.

SOURCE: See Appendix B.
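
The figure notes pair a simple 95 percent confidence interval with a Bonferroni-adjusted test against the United States. The Python sketch below illustrates one way such a comparison could be computed from the printed intervals alone. It is not the report's procedure: it assumes each interval is symmetric (mean plus or minus about 1.96 standard errors), that the samples are independent, and that a normal approximation can stand in for the t distribution. The function names are hypothetical. The three intervals are those printed for Figure E.22.

```python
from math import sqrt
from statistics import NormalDist

# Multiplier behind a simple two-sided 95 percent interval (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def mean_and_se(lower, upper):
    """Recover the mean and standard error from a simple 95 percent interval.

    Assumption: the interval is symmetric, i.e. mean +/- Z95 * standard error.
    """
    mean = (lower + upper) / 2
    se = (upper - lower) / (2 * Z95)
    return mean, se


def differs_from_us(system, us, n_comparisons, alpha=0.05):
    """Bonferroni-adjusted two-sided test of one system against the United States.

    Assumptions (not taken from the report): independent samples and a
    normal approximation in place of the t distribution.
    Returns (significant, z statistic, critical value).
    """
    m_sys, se_sys = mean_and_se(*system)
    m_us, se_us = mean_and_se(*us)
    z = (m_sys - m_us) / sqrt(se_sys ** 2 + se_us ** 2)
    # Bonferroni: split alpha across all comparisons, then across two tails.
    critical = NormalDist().inv_cdf(1 - alpha / (2 * n_comparisons))
    return abs(z) > critical, z, critical


# Intervals as printed in Figure E.22 (11 comparisons with the United States).
united_states = (469.1, 487.9)
ireland = (462.4, 476.2)
british_columbia = (547.2, 555.4)

for name, interval in [("Ireland", ireland), ("Canada (British Columbia)", british_columbia)]:
    significant, z, critical = differs_from_us(interval, united_states, n_comparisons=11)
    print(f"{name}: z = {z:.2f}, critical = {critical:.2f}, differs from U.S.: {significant}")
```

Under these assumptions the adjusted critical value is noticeably larger than 1.96, so two simple 95 percent intervals that barely fail to overlap do not by themselves establish a significant difference once 11 comparisons are taken into account.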






United States 
Department of Education 
Washington, D.C. 20208-5650 



Official Business 
Penalty for Private Use, $300 



Postage and Fees Paid
U.S. Department of Education
Permit No. G-17



FOURTH CLASS BOOK RATE 






